Annotation as Algebra: a formal framework for linguistic annotation
Mark Liberman, University of Pennsylvania, myl@cis.upenn.edu
(joint work with Steven Bird, Melbourne University)
HP Labs Bangalore, 8/21/2003
Outline
· Motivation
· Sketch of the idea
· Survey of linguistic annotation
· Annotation graphs as a formal framework
· Practical implementations and experience
· Issues for the future
What linguistic annotation is (and isn’t)
· “Linguistic annotation” means symbolic descriptions of specific linguistic signals
  - e.g. transcriptions, parses, etc.
· It does not include things like:
  - metadata
    § e.g. information about speakers, recordings, documents, etc.
    § typically stored in an RDB referenced by elements of linguistic annotation
  - lexicons
· But these can be treated in a common framework
Motivation
· A jungle of annotation file formats
  - e.g. more than 20 common formats for time-marked orthographic transcriptions
  - many new formats every year
· Multiple annotations of the same data
· No good way to search annotations
  - different coding needed for each format
  - extra difficulty of searches across formats
· Problems for:
  - tool builders
  - researchers
  - corpus builders and maintainers
Basic idea #1: what to do
· Abstract away from file formats, to the logical structure of linguistic annotation
· Replace the two-level model with a three-level model
  - as in database technology several decades ago
  - so many applications can access many kinds of data through a consistent API
· Choose a logical structure with good properties
  - simple, conceptually natural, computationally efficient
  - algebra to facilitate boolean combination of queries
Two-level model: [diagram not reproduced]
Three-level model: [diagram not reproduced]
Basic idea #2: how to do it
· Three kinds of assertion recur in linguistic annotation
  - assigning a label: “This chunk of stuff has property X”
  - sequencing labels: “chunk B immediately follows chunk A”
  - anchoring the edges of labels: “this chunk boundary has coordinates k” (in time, space, text...)
· Formalized as a labeled DAG, these primitives provide a logical structure adequate for all linguistic annotation
· The result also defines an algebra useful for searching and in other ways
Basic assertion type 1: labeling
Associate a “label” (typed, structured symbolic information) with a region of a linguistic signal
Basic assertion type 2: sequencing
Example: The stretch of signal labeled “this” is followed by a stretch of signal labeled “is”
Basic assertion type 3: anchoring
Example: The stretch of signal labeled “this” begins 137.4592 seconds from the start of file XYZ.
Informalization
· An “annotation graph” (AG) is:
  - a directed acyclic graph
  - whose arcs are labeled with fielded records, e.g. phoneme=“p” or word=“this”
  - whose nodes may be labeled with signal coordinates, e.g. 3.45692 seconds
· Labeling → arc labels
· Sequencing →
· Anchoring → signal coordinates on nodes
· That’s all!
Outcome
· API, open source toolkit (C, C++, Tcl, Python); sample tools
· Java version (“ATLAS”) developed by NIST
Annotation formats & tools
· Surveyed in 1999 by Liberman and Bird
· Documented on web page http://ldc.upenn.edu/annotation
· Used in designing annotation graph system & AG software
· Survey is updated periodically
Some animals in the annotation zoo
1. TIMIT
2. BAS Partitur
3. CHILDES
4. LACITO
5. LDC CALLHOME
6. NIST UTF
7. Switchboard (four types of annotation)
8. ... etc.
Sample TIMIT data
train/dr1/fjsp0/sa1.wrd:
  2360 5200 she
  5200 9680 had
  9680 11077 your
  11077 16626 dark
  16626 22179 suit
  22179 24400 in
  24400 30161 greasy
  30161 36150 wash
  36720 41839 water
  41839 44680 all
  44680 49066 year
train/dr1/fjsp0/sa1.phn:
  0 2360 h#
  2360 3720 sh
  3720 5200 iy
  5200 6160 hv
  6160 8720 ae
  8720 9680 dcl
  9680 10173 y
  10173 11077 axr
  11077 12019 dcl
  12019 12257 d
  ...
TIMIT interpreted graphically: [figure not reproduced; boundary offsets shown include 5200, 6160, 8720, 9680]
TIMIT as Annotation Graph
W = word level:
  5200 9680 had
P = phoneme level:
  5200 6160 hv
  6160 8720 ae
  8720 9680 dcl
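As a rough illustration of how the TIMIT lines above map onto annotation-graph arcs, here is a minimal Python sketch. The `Arc` type and `parse_timit` helper are hypothetical names introduced for this example (they are not part of AGTK), and the sketch assumes that boundaries with the same sample offset share one node.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    """One labeled arc; nodes are identified by their sample offsets (an assumption)."""
    start: int   # offset of the source node
    end: int     # offset of the destination node
    layer: str   # "W" for word level, "P" for phoneme level
    label: str


def parse_timit(text: str, layer: str) -> list:
    """Turn 'start end label' lines into arcs for one layer."""
    arcs = []
    for line in text.strip().splitlines():
        start, end, label = line.split()
        arcs.append(Arc(int(start), int(end), layer, label))
    return arcs


WRD = "5200 9680 had"
PHN = """5200 6160 hv
6160 8720 ae
8720 9680 dcl"""

arcs = parse_timit(WRD, "W") + parse_timit(PHN, "P")
# The node set is the set of distinct boundary offsets.
nodes = sorted({t for a in arcs for t in (a.start, a.end)})
print(nodes)  # [5200, 6160, 8720, 9680]
```

The word arc and the three phoneme arcs span the same pair of end nodes, which is how the graph records that "had" covers "hv ae dcl" without any explicit hierarchy.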
BAS Partitur
Goal: a common format for research results from many German speech projects.
A multi-tier description of speech signals:
  KAN - the canonical transcription
  ORT - orthographic transcription
  TRL - transliteration
  MAU - phonetic transcription
  DAS - dialogue act transcription
BAS Partitur: example
KAN: 0 j'a:
KAN: 1 S'2:n@n
KAN: 2 d'a.Nk
KAN: 3 das+
KAN: 4 v.E:r@+
KAN: 5 z'e:6
KAN: 6 n'Et
ORT: 0 ja
ORT: 1 schönen
ORT: 2 Dank
ORT: 3 das
ORT: 4 wäre
ORT: 5 sehr
ORT: 6 nett
MAU: 4160 1119 0 j
MAU: 5280 2239 0 a:
MAU: 7520 2399 1 S
MAU: 9920 1599 1 2:
MAU: 11520 479 1 n
MAU: 12000 479 ...
MAU: 12480 479 -1 ...
DAS: 0,1,2 @(THANK_INIT BA)
DAS: 3,4,5,6 @(FEEDBACK_ACKNOWLEDGEMENT BA)
BAS Partitur graphical structure: [figure not reproduced; it aligns the DAS, ORT, KAN and MAU tiers, e.g. KAN: 0 j'a: / ORT: 0 ja / MAU: 4160 1119 0 j, and KAN: 1 S'2:n@n / ORT: 1 sch"onen / MAU: 5280 2239 0 a:, under DAS: 0,1,2 @(THANK_INIT BA)]
Partitur differences from TIMIT
File organization:
  - everything is in a single file (even metadata)
Time marking:
  - time anchors are in only one tier (MAU)
  - time anchors use <start offset, duration-1>
Relationship between the tiers:
  - KAN tier supplies a set of identifiers
  - MAU tier: several lines for each KAN line
  - DAS tier: one line for several KAN lines
Temporal structure:
  - MAU and DAS define convex intervals
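The <start offset, duration-1> convention can be made concrete with a short Python sketch. It uses only the five complete MAU rows from the example, and the helper name `mau_to_arc` is hypothetical. The end of each interval is taken to be start + (duration-1) + 1, which makes consecutive segments abut exactly in this data.

```python
MAU_ROWS = """\
MAU: 4160 1119 0 j
MAU: 5280 2239 0 a:
MAU: 7520 2399 1 S
MAU: 9920 1599 1 2:
MAU: 11520 479 1 n"""


def mau_to_arc(line):
    """Convert one MAU line to (start, end, KAN index, phone)."""
    _tag, start, dur_minus_1, kan_index, phone = line.split()
    start = int(start)
    end = start + int(dur_minus_1) + 1  # stored value is duration - 1
    return (start, end, int(kan_index), phone)


arcs = [mau_to_arc(line) for line in MAU_ROWS.splitlines()]

# Word-level spans are recovered by grouping MAU segments on their KAN index.
spans = {}
for start, end, kan, _phone in arcs:
    lo, hi = spans.get(kan, (start, end))
    spans[kan] = (min(lo, start), max(hi, end))

print(arcs[0])
print(spans)
```

Because only MAU carries time anchors, the spans for the KAN, ORT and DAS tiers are all derived this way, through the shared KAN identifiers.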
BAS Partitur: Annotation graph
[figure not reproduced; the graph carries ORT arcs (ORT: 0 ja, ORT: 1 sch"onen), MAU arcs (4160 1119 0 j; 5280 2239 0 a:; 7520 2399 1 S; 9920 1599 1 2:; 11520 479 1 n) and a DAS arc (DAS: 0,1,2 @(THANK_INIT BA))]
CHILDES
· Child language acquisition data
· Archive organized by Brian MacWhinney at CMU
· CHAT transcription format
· Tools for creating, browsing, searching
· Contributions by many researchers around the world
CHILDES Annotation
*ROS: yahoo.
%snd: "boys73a.aiff" 7349 8338
*FAT: you got a lot more to do # don't you?
%snd: "boys73a.aiff" 8607 9999
*MAR: yeah.
%snd: "boys73a.aiff" 10482 10839
      because I'm not ready to go to <the bathroom> [>] +/.
      "boys73a.aiff" 11621 13784
CHILDES differences from TIMIT
· long recordings with multiple speakers
· time specified at turn level only
· there are gaps between the turns
· the transcription contains embedded annotations
CHILDES annotation graph
[figure not reproduced; it shows the first two turns as arcs:]
*ROS: yahoo.
%snd: "boys73a.aiff" 7349 8338
*FAT: you got a lot more to do # don't you?
%snd: "boys73a.aiff" 8607 9999
NB: incomplete time info, disconnected structure
CHILDES: RDB connection
“Metadata” about speakers, recordings etc. is stored separately in relational tables:
ID  NAME   ROLE    AGE     SEX   BIRTH
1   Ross   Child   6;3.11  male  23-DEC-1977
2   Mark   Child   4;4.15  male  19-NOV-1979
3   Brian  Father
4   Mary   Mother
LACITO
Langues et Civilisations à Tradition Orale
  - recordings of unwritten languages, collected and transcribed over three decades
  - preservation and dissemination
Based on XML
  - markup for alignment to audio signal
  - different XSL style sheets for display
    § generating HTML with hyperlinks to audio clips
LACITO example
<S id="s1">
  <AUDIO start="2.3656" end="7.9256"/>
  <TRANSCR>
    <W><FORM>nakpu</FORM> <GLS>deux</GLS></W>
    <W><FORM>nonotso</FORM> <GLS>soeurs</GLS></W>
    <W><FORM>si&x014b;</FORM> <GLS>bois</GLS></W>
    <W><FORM>pa</FORM> <GLS>faire</GLS></W>
    <W><FORM>la&x0294;natshem</FORM> <GLS>allerent</GLS></W>
    <W><FORM>are</FORM> <GLS>dit.on</GLS></W>
    <PONCT>.</PONCT>
  </TRANSCR>
  <TRADUC lang="Francais"> On raconte que deux soeurs allerent cher du bois. </TRADUC>
  <TRADUC lang="Anglais"> They say that two sisters went to get firewood. </TRADUC>
</S>
LACITO as AG
[figure not reproduced; the graph is anchored by <AUDIO start="2.3656" end="7.9256"/>, with one arc per word and one arc per translation:]
<W><FORM>nakpu</FORM> <GLS>deux</GLS></W>
<W><FORM>nonotso</FORM> <GLS>soeurs</GLS></W>
<W><FORM>si&x014b;</FORM> <GLS>bois</GLS></W>
<W><FORM>pa</FORM> <GLS>faire</GLS></W>
<TRADUC lang="Francais">On raconte que deux...</TRADUC>
<TRADUC lang="Anglais">They say that two...</TRADUC>
LACITO discussion
Two kinds of partiality for times:
  - where they are simply unknown
  - where they are inappropriate
Unknown times:
  - the annotation is incomplete
  - time-alignment is coarse-grained
Inappropriate times:
  - for word boundaries in the phrasal translation
  - for punctuation?
LDC CallHome example
980.18 989.56 A: you know, given how he's how far he's gotten, you know, he got his degree at &Tufts and all, I found that surprising that for the first time as an adult they're diagnosing this. %um
989.42 991.86 B: %mm. I wonder about it. But anyway.
991.75 994.65 A: yeah, but that's what he said. And %um
994.19 994.46 B: yeah.
995.21 996.59 A: He %um
996.51 997.61 B: Whatever's helpful.
997.40 1002.55 A: Right. So he found this new job as a financial consultant and seems to be happy with that.
1003.14 1003.45 B: Good.
LDC CallHome as AG
[figure not reproduced; it shows these overlapping turns as arcs:]
995.21 996.59 A: He %um
996.51 997.61 B: Whatever's helpful.
997.40 1002.55 A: Right. So...
CallHome discussion
Speaker overlap
  - No special devices, just turn time-marks
  - Scales for an arbitrary number of speakers
  - Information about word-level overlap is left ambiguous
  - Additional time references could easily specify word overlap
NIST UTF (circa 1999)
NIST: National Institute of Standards and Technology (USA)
UTF: “Universal Transcription Format”
  - Intended to generalize over several earlier LDC broadcast news and conversation transcription formats
  - Special treatment for: metadata, time stamps, speaker overlap, contractions
  - N.B. now abandoned in favor of AG-based representations
NIST UTF example (from BN)
<turn speaker="Roger_Hedgecock" spkrtype="male" dialect="native"
  start="2348.811875" end="2391.606000" mode="spontaneous" fidelity="high">
<time sec="2387.353875">
on welfare and away from real ownership {breath and
<contraction e_form="[that=>that]['s=>is]">that's a real problem in this
<b_overlap start="2391.115375" end="2391.606000">country<e_overlap>
</turn>
<turn speaker="Gloria_Allred" spkrtype="female" dialect="native"
  start="2391.299625" end="2439.820312" mode="spontaneous" fidelity="high">
<b_overlap start="2391.299625" end="2391.606000">well i<e_overlap>
think the real problem is that %uh these kinds of republican attacks
<time sec="2395.462500">
i see as code words for discrimination
</turn>
NIST UTF: turn element
<turn speaker="Roger_Hedgecock" spkrtype="male" dialect="native"
  start="2348.811875" end="2391.606000" mode="spontaneous" fidelity="high">
NIST UTF: contraction
<contraction e_form="[that=>that]['s=>is]">that's
NIST UTF: overlap
<b_overlap start="2391.115375" end="2391.606000">country<e_overlap>
NIST UTF: discussion
  - Relational data (e.g. speaker demographics) is embedded in the annotation (redundantly).
  - Time stamps are stored in three different places.
  - Speaker overlap is convolved with the speaker turn, so a time relation with an external event disrupts the internal structure of a turn.
  - Contractions are treated in a way that facilitates a link to the lexicon, but may be hard to ignore in a search function.
NIST UTF as AG: [figure not reproduced]
AG contraction treatment
Additional textual annotations (e.g. for expanding a contraction) don't complicate the existing representation, which facilitates search.
NIST UTF / AG version
  - Metadata: stored in a separate RDB table (cf. CHILDES)
  - Time stamps: stored in a single place, the AG nodes
  - Speaker overlap: not convolved with the speaker turn, so a temporal relationship with an external event remains external to the structure of a turn
  - Contractions: no new device, easily ignored in search
  - No artificial order on speaker turns
Switchboard
  - Corpus of 2400 5-minute telephone conversations collected at Texas Instruments in 1991
  - Transcribed and aligned on three levels: conversation, speaker turn, word
  - Subsequently annotated for: POS, syntactic structure, breath groups, disfluencies, speech acts, phonetic segments, etc.
  - Then re-transcribed with many corrections!
Consequences:
  - Proliferation of layers with different tokenizations
  - Problem of correction after annotation
SWB example (1, 2)
(1) Time-aligned words:
  speaker: B
  times: 21.86 22.12 22.38 22.56 22.86 23.88 24.02 24.18 24.52 24.80 24.86 24.98 25.66 25.88
  durations: 0.26 0.18 0.06 0.32 0.14 0.16 0.32 0.28 0.06 0.12 0.22
  words: Metric system, no one's very, uh, no one wants it at all seems like.
(2) Part-of-speech tags:
  [ Metric/JJ system/NN ] ,/, [ no/DT one/NN ] 's/BES very/RB ,/, [ uh/UH ] ,/, [ no/DT one/NN ] wants/VBZ [ it/PRP ] at/IN [ all/DT ] seems/VBZ like/IN ./.
SWB example (3, 4)
(3) B.22: Yeah, / no one seems to be adopting it. / Metric system, [ no one's very, + {F uh, } no one wants ] it at all seems like. /
(4) ((S (NP-TPC Metric system) , (S-TPC-1 (EDITED (RM [) (S (NP-SBJ no one) (VP 's (ADJP-PRD-UNF very))) , (IP +)) (INTJ uh) , (NP-SBJ no one) (VP wants (RS ]) (NP it) (ADVP at all))) (NP-SBJ *) (VP seems (SBAR like (S *T*-1))) . E_S))
Switchboard: AG [figure not reproduced]
Another multiple annotation
It is quite realistic to have this many diverse annotations (and more!) for the same material...
AG formalization: Background
Annotation, the basic action:
  - associate a label with an extent of signal
  - labels may be of different types
  - labels may span different amounts of time; they need not form a hierarchy
Minimal formalization:
  - directed graph
  - typed, fielded records on the arcs
  - optional time references on the nodes
Timelines
  - Nodes are anchored to signals using offsets
  - An annotation may reference more than one signal, e.g.
    § simultaneous audio and video
    § signals from multiple microphones
    § audio and physiological signals
  - All the signals covered by a given annotation must be from the same "flow of time" = timeline T
  - but signals may cover a timeline only partially
(Other ordered sets, such as the sequence of characters in a text, may also be treated as timelines...)
Two Signals, One Timeline
(Could be treated as a single multi-channel signal, but different channels might be in different files, have different frame rates, etc.)
AG: Formal Definition
An Annotation Graph G over a label set L and timeline T is a 3-tuple <N, A, t>:
  - N = set of nodes
  - A = set of arcs labelled with elements of L
  - t = partial function from N to T
satisfying the following conditions:
  1. <N, A> is acyclic, with no nodes of degree zero
  2. for any path from node n1 to n2, if t(n1) and t(n2) are defined, then t(n1) <= t(n2)
Condition 1
1. <N, A> is acyclic, with no nodes of degree zero
1a. AGs are acyclic
  - expresses the linearity of signal annotations
  - an important property wrt implementations and to QLs containing path expressions
1b. AGs have no orphan nodes
  - the only point of nodes is to anchor the arcs
  - avoids the situation of AGs that are identical but for orphan nodes
Condition 2
2. for any path from node n1 to n2, if t(n1) and t(n2) are defined, then t(n1) <= t(n2)
  - AGs respect the flow of time (or the structure of another anchoring space)
[figure not reproduced]
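The two conditions can be checked mechanically. The following Python sketch is one possible check, not the AGTK implementation: the function name `is_valid_ag` and the representation (node ids in a set, arcs as `(source, destination, label)` triples, `t` as a dict holding only the anchored nodes) are assumptions made for illustration.

```python
from collections import defaultdict


def is_valid_ag(nodes, arcs, t):
    """Check conditions 1 and 2 for <N, A, t>; t maps only anchored nodes to times."""
    # Condition 1b: every node is an endpoint of some arc (no degree-zero nodes).
    touched = {n for src, dst, _label in arcs for n in (src, dst)}
    if set(nodes) != touched:
        return False

    # Condition 1a: acyclicity, via Kahn's topological sort.
    succ = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst, _label in arcs:
        succ[src].append(dst)
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        n = ready.pop()
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(indegree):
        return False  # some nodes sit on a cycle

    # Condition 2: carry the latest anchored time forward along every path.
    latest = {}  # node -> max time of its anchored ancestors
    for n in order:
        bound = latest.get(n)
        if n in t:
            if bound is not None and t[n] < bound:
                return False
            bound = t[n] if bound is None else max(bound, t[n])
        if bound is not None:
            for m in succ[n]:
                latest[m] = bound if m not in latest else max(latest[m], bound)
    return True


arcs = [(1, 2, "hv"), (2, 3, "ae"), (1, 3, "had")]
print(is_valid_ag({1, 2, 3}, arcs, {1: 5200, 3: 9680}))  # True
print(is_valid_ag({1, 2, 3}, arcs, {1: 9680, 3: 5200}))  # False
```

Note that node 2 is unanchored in both calls: because t is partial, condition 2 only constrains pairs of anchored nodes connected by a path.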
AG: Interpretation of Labels
Arc labels may be interpreted as:
  - substantive content conforming to a coding practice
  - meta-commentary
  - a reference to other material
  - an identifier
  - arbitrary binary data
Choice of label interpretations falls outside the scope of the formalism
AG: Expressiveness
Is the formalism too minimalist? Some things that some people want:
  1. cross-reference from a label to another arbitrary label, arc or node
  2. labels as well as anchors for nodes
  3. anchoring nodes to arcs or labels rather than timelines
  4. anchoring arcs/labels in 2- or 3-dimensional spaces
  5. recursive structures in labels
“Core AG” has sufficient expressive capacity to encode, in an intuitive way, all commonly used formats, and also good properties wrt creation, maintenance, search
Our strategy:
  - see how far we can go with this core
  - dispense with more complex syntax and focus on semantics
  - but some of (1) has been added in the core AG implementation, and (4) has been added in “ATLAS” (NIST version)
Structures for a single layer
All of these have (one or more) natural representations in the basic AG formalism. Multiple layers can of course be added in a general way.
Equivalence classes
Equivalence classes (joint reference to an external ID) provide a way to establish symmetrical inter-label linkages without any new formal devices
AG as algebra
· An AG can be represented as a set of arcs, each with an associated label and (optionally anchored) source and destination nodes
· The power set of this arc set defines a boolean algebra (as usual)
· Every member of the power set is itself a well-defined AG
· This algebra can be used for queries, just as the relational algebra is for RDBs
· Adding e.g. pointers from labels to other arcs compromises this property (because arc subsets are not well-formed if pointers cannot be dereferenced)
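A small Python sketch may help show what "AG as a set of arcs" buys for queries. The `Arc` record and `select` helper are hypothetical names for this example. Each arc carries its own node ids and optional anchors, so any subset of arcs is a self-contained graph and queries combine with ordinary set operations.

```python
from typing import NamedTuple, Optional


class Arc(NamedTuple):
    src: str
    src_time: Optional[float]  # None when the node is unanchored
    dst: str
    dst_time: Optional[float]
    label: tuple               # fielded record, e.g. ("word", "had")


ag = frozenset({
    Arc("n1", 5200, "n4", 9680, ("word", "had")),
    Arc("n1", 5200, "n2", 6160, ("phoneme", "hv")),
    Arc("n2", 6160, "n3", 8720, ("phoneme", "ae")),
    Arc("n3", 8720, "n4", 9680, ("phoneme", "dcl")),
})


def select(graph, predicate):
    """A query returns the sub-AG of arcs satisfying the predicate."""
    return frozenset(a for a in graph if predicate(a))


phonemes = select(ag, lambda a: a.label[0] == "phoneme")
early = select(ag, lambda a: a.dst_time is not None and a.dst_time <= 8720)

both = phonemes & early        # conjunction of the two queries
either = phonemes | early      # disjunction
not_phonemes = ag - phonemes   # complement relative to the whole AG

print(sorted(a.label[1] for a in both))          # ['ae', 'hv']
print(sorted(a.label[1] for a in not_phonemes))  # ['had']
```

If labels were allowed to point at other arcs, `both` or `not_phonemes` could contain a pointer whose target was filtered out, which is the loss of closure the last bullet describes.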
AG as RDB
· An AG can therefore also be interpreted as a relational table
· or (more conveniently) as a set of three relational tables
· This allows standard RDB implementations to be used for AG storage and retrieval
· Obvious advantages, though standard RDB may not use AG structure optimally...
Relational Representation
[figure: arc Ann1 with label <l1, l2, ..., ln>, running from anchor a1 (offset t1) to anchor a2 (offset t2)]
Three relations:
  - anchor
  - annotation (= arc)
  - feature (= label)
Anchor Relation
AnchorId  Offset
a1        t1
a2        t2
Annotation (arc) Relation
AnnotationId  Source  Destination
Ann1          a1      a2
Feature Relation
AnnotationId  Feature  Value
Ann1          F1       l1
Ann1          F2       l2
...
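The three relations can be tried out with Python's standard `sqlite3` module. This is a sketch only: the column names follow the slides, but the column types, the feature names `type` and `label`, and the sample row (the TIMIT word "had") are assumptions for illustration, not the schema used by AGTK.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE anchor     (AnchorId TEXT PRIMARY KEY, "Offset" REAL);
CREATE TABLE annotation (AnnotationId TEXT PRIMARY KEY, Source TEXT, Destination TEXT);
CREATE TABLE feature    (AnnotationId TEXT, Feature TEXT, Value TEXT);
""")

# One arc, Ann1, from anchor a1 to anchor a2, carrying two features.
con.executemany("INSERT INTO anchor VALUES (?, ?)", [("a1", 5200), ("a2", 9680)])
con.execute("INSERT INTO annotation VALUES (?, ?, ?)", ("Ann1", "a1", "a2"))
con.executemany(
    "INSERT INTO feature VALUES (?, ?, ?)",
    [("Ann1", "type", "word"), ("Ann1", "label", "had")],
)

# Find the extent of every arc whose label feature is 'had'.
rows = con.execute("""
SELECT f.Value, s."Offset", d."Offset"
FROM feature f
JOIN annotation a ON a.AnnotationId = f.AnnotationId
JOIN anchor s ON s.AnchorId = a.Source
JOIN anchor d ON d.AnchorId = a.Destination
WHERE f.Feature = 'label' AND f.Value = 'had'
""").fetchall()
print(rows)
```

Splitting labels into a separate feature table lets an arc carry any number of fields without changing the schema, at the cost of a join for every label lookup.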
Queries across multiple tables
Speaker table:
ID    Sex  DR  Ht
AKS0  F    1   5'04"
ASW0  F    5   5'06"
BJL0  F    5   5'07"
Lexicon:
ha     /hh aa1/
habit  /hh ae1 b ix t/
had    /hh ae1 d/
hafta  /hh ae1 f t ax/
Signal path: train/dr2/fbjl0/
Queries on AG Tables
select * from FEATURE where FEATURE.AGID="Timit:AG80"
select ANNOTATIONID, SPKRINFO.ID from FEATURE, SPKRINFO
  where SPKRINFO.DR=1 and SPKRINFO.Ht=70 and FEATURE.VALUE="dark"
AG software
· AGTK
  - provides API
  - and language bindings
  - version 2.0 recently released
· Sample applications
· Open-source license
· Available on sourceforge:
AGTK architecture: [diagram not reproduced]
API Summary
· Functions for creating, accessing, modifying, storing and loading AGs
· C++ library
· Compiles on Unix and Windows
· Scripting language access: Python, Tcl/Tk
File I/O Library
Approach:
  - build import methods for all widely used formats
  - public API & documentation to encourage others to contribute code for their formats
Currently supported:
  - AIF (ATLAS Interchange Format, XML)
  - BAS, BU, CALLHOME, CSV, Switchboard, TIMIT, Treebank, xlabel
Integration with other tools
Example: WaveSurfer/SNACK (Sjölander and Beskow), www.speech.kth.se/wavesurfer/
  - open source software for sound visualization, analysis and manipulation
  - Linux, Windows 95/98/NT/2k, Mac, Solaris, ...
  - customizable, extensible, embeddable
  - can read and write:
    § wav, au, aiff, mp3, csl, sd, sphere
    § unlimited file size
  - Unicode support
Wavesurfer Screenshots 1-4: [screenshots not reproduced]
Annotation Component: Spreadsheet (TRAINS+DAMSL)
  - Annotation is presented here in spreadsheet mode
  - Each row is an annotation of a stretch of signal
  - Each column is a type of annotation
TableTrans tool
Seamless integration of AGTK for annotation, and Wavesurfer for audio display and playback.
Components in TableTrans: [diagram not reproduced]
Another annotation GUI: [screenshot not reproduced]
Issues for the future
Some positive things
  - “stand-off” (rather than in-line) annotation
    § is now common though by no means universal
    § but in-line annotators mostly realize they are sinful
  - AGTK implementation is mature
    § libraries are well designed & implemented
    § good integration with GUIs and DB backends
    § can read/write many common formats
  - Some AG-based tools are good
    § basically, those that have really been used
    § demand pull & influence of users on development
Issues for the future
Some things need more work
  - AG API and AGTK are not yet widely used
  - Many AG-based tools are rough sketches
  - NIST ATLAS is not popular with researchers (Java, complexity)
  - For many projects, something simpler & less general is still the local optimum:
    § lines of tab-separated fields, or
    § in-line mark-up (XML or ad hoc), or
    § other legacy or new ad hoc formats
but it’s still early days...