TEI Lite: a sample application


TEI Lite
  • one of many possible views of the TEI DTD
  • what do most people want, most of the time?
  • realistic for existing texts, e.g. OTA, Virginia
  • realistic for document production, e.g. TEI technical documentation
  • see http://www.tei-c.org/Lite/

Basic structure(s)
  • A TEI-conformant document comprises a header followed by a text
  • the header contains:
      • mandatory file description
      • optional encoding, profile and revision descriptions
  • the header is essential for:
      • bibliographic control and identification
      • resource documentation and processing

Structure of a TEI text
  • A text may be unitary or composite
  • a unitary text contains
      • front matter
      • back matter
      • a body
  • in a composite text, the body is a group of texts (or nested groups)

TEI basic structures
[Diagram: tree of the elements teiCorpus.2, teiHeader, TEI.2, text, front, group, body, back and div]

TEI Structures Summarized
  • TEI.2 :: teiHeader text
  • text :: front? (body|group) back?
  • group :: (text|group)+
  • teiCorpus.2 :: teiHeader TEI.2+
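
The content models summarized above can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes the document has been converted to XML, and the names CONTENT_MODELS and check are hypothetical. Each model is written as a regular expression over the space-terminated names of an element's children.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the slide, as regexes over "name " sequences.
# These follow the simplified summary only, not the full TEI DTD.
CONTENT_MODELS = {
    "TEI.2": r"teiHeader text ",
    "text": r"(front )?(body |group )(back )?",
    "group": r"(text |group )+",
    "teiCorpus.2": r"teiHeader (TEI\.2 )+",
}

def check(element):
    """Return the tags of all elements whose children violate their model."""
    problems = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if re.fullmatch(model, children) is None:
            problems.append(element.tag)
    for child in element:
        problems.extend(check(child))
    return problems

good = ET.fromstring(
    "<TEI.2><teiHeader/><text><front/><body/><back/></text></TEI.2>"
)
bad = ET.fromstring(
    "<TEI.2><teiHeader/><text><back/><body/></text></TEI.2>"
)
print(check(good))
print(check(bad))
```

Elements with no entry in the table (such as body or div) are not checked here; a real validator would use the DTD itself.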

A text usually has divisions
  • generic, hierarchic subdivisions
  • vanilla or numbered
  • type attribute
  • associated head and trailer elements from the divtop class

for example...
    <text>
      <front> <!-- titlepage, etc here --> </front>
      <body>
        <div1 type='book' n='I' id=JA0100>
          <head>Book I.</head>
          <div2 type='chapter' n='1' id=JA0101>
            <head>Of writing lives in general, ...</head>
            <!-- remainder of chapter 1 here -->
          </div2>
          <div2 n='2' id=JA0102> <!-- chapter 2 here --> </div2>
          <!-- remainder of book 1 here -->
        </div1>
        <div1 type='book' n='II' id=JA0200> <!-- book 2 here --> </div1>
        <!-- remaining books here -->
      </body>
    </text>
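
One practical use of numbered divisions with type, n and id attributes is generating an outline. The sketch below is an illustration only: it assumes an XML version of the example (quoted attribute values, no comments), and SAMPLE and outline are hypothetical names.

```python
import xml.etree.ElementTree as ET

# XML rendering of the slide's example; attribute values are quoted
# because XML, unlike SGML, requires it.
SAMPLE = """
<text>
  <front/>
  <body>
    <div1 type="book" n="I" id="JA0100">
      <head>Book I.</head>
      <div2 type="chapter" n="1" id="JA0101">
        <head>Of writing lives in general</head>
      </div2>
      <div2 n="2" id="JA0102"/>
    </div1>
    <div1 type="book" n="II" id="JA0200"/>
  </body>
</text>
"""

def outline(element, depth=0):
    """List every div, div1, div2, ... beneath element, indented by nesting."""
    lines = []
    for child in element:
        if child.tag.startswith("div"):
            label = f"{child.get('type', 'div')} {child.get('n', '?')} [{child.get('id', '')}]"
            head = child.find("head")
            if head is not None and head.text:
                label += ": " + head.text.strip()
            lines.append("  " * depth + label)
            lines.extend(outline(child, depth + 1))
        else:
            # Not a division: keep looking inside it at the same depth.
            lines.extend(outline(child, depth))
    return lines

print("\n".join(outline(ET.fromstring(SAMPLE))))
```

A division without a type attribute (chapter 2 in the example) falls back to the generic label "div".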

Use of global attributes
  • Applicable to all elements
      • id for unique identification
      • n for (non-unique) name or number
      • rend for rendition (appearance)
      • lang for language and hence writing-system
  • Extended by some tagsets (e.g. linking, analysis)

Character Encoding Recommendations
  • non normative
  • extend, using standard entity sets or transliteration
  • document transliteration scheme with formal Writing System Declaration

    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    " % & ' ( ) * + , - . / : ; < = > ? _ (space)

Text components in TEI Lite
  • What are divisions composed of?
      • prose is mostly paragraphs (<p>)
      • verse is mostly lines (<l>), sometimes in hierarchic groups (<lg>)
      • drama is mostly speeches (<sp>) containing <p> or <l> and interspersed with stage directions (<stage>)
  • These may be mixed, and may also appear directly within undivided texts.

Verse: an example
    <lg type='haiku'>
      <l>Summer grass —</l>
      <l>all that's left</l>
      <l>of warriors' dreams.</l>
    </lg>

Drama: an example

    Enter Barnardo and Francisco, two Sentinels, at several doors
    Barnardo: Who's there?
    Francisco: Nay, answer me. Stand and unfold yourself.
    Barnardo: Long live the king!
    Francisco: Barnardo?
    Barnardo: He.

    <stage>Enter Barnardo and Francisco, two Sentinels, at several doors</stage>
    <sp who='Barnardo'><l>Who's there?</l></sp>
    <sp who='Francisco'><l>Nay, answer me. Stand and unfold yourself.</l></sp>
    <sp who='Barnardo'><l>Long live the king!</l></sp>
    <sp who='Francisco'><l>Barnardo?</l></sp>
    <sp who='Barnardo'><l>He.</l></sp>

Texts are not just words...
  • … but probably only people know that
  • an encoding may claim to capture
      • just visual salience,
      • just its assumed causes
      • both
  • encoding makes explicit one (or more) sets of interpretations

For example...

    And this Indenture further witnesseth that the said Walter Shandy, merchant, in consideration of the said intended marriage...

    <hi rend='gothic'>And this Indenture further witnesseth</hi> that the said <hi rend='italic'>Walter Shandy</hi>, merchant, in consideration of the said intended marriage...

…or...

    And this Indenture further witnesseth that the said Walter Shandy, merchant, in consideration of the said intended marriage...

    <seg type='formula'>And this Indenture further witnesseth</seg> that the said <name rend='italic'>Walter Shandy</name>, merchant, in consideration of the said intended marriage...

Who does the work?
  • TEI scheme allows for close reading -- and the reverse
  • can tag very detailed features of discourse function
  • can normalise or simplify (e.g. dates, numbers, names)
  • … or leave well alone

Phrase level elements include...
  • phrases that are conventionally typographically distinct
  • “data-like” (names, numbers, dates, times, addresses)
  • editorial intervention (corrections, regularizations, additions, omissions...)
  • cross references and links (see later)

for example...
    <head>Of writing lives in general, and particularly of <title>Pamela</title>, with a word by the bye of <name>Colley Cibber</name> and others.</head>
    <p>It is a trite but true observation, that <q>examples work more forcibly on the mind than precepts</q>. …
    <p><name>Mr. Joseph Andrews</name>, <rs>the hero of our ensuing history</rs>, was esteemed to be...

Direct speech
  • Use the who attribute to show speakers
  • Speeches can be nested in other speeches

    <q who='Wilson'>Spaulding, he came down into the office just this day eight weeks with this very paper in his hand, and he says: — <q who='Spaulding'>I wish to the Lord, Mr. Wilson, that I was a red-headed man.</q>

Foreign language phrases
  • The lang attribute may be attached to any element
  • Use <foreign> if nothing else is available
  • Define each language in <langUsage> in header

    Have you read <title lang='deu'>Die Dreigroschenoper</title>?
    <mentioned lang='fra'>Savoir-faire</mentioned> is French for know-how.
    John has real <foreign lang='fra'>savoir-faire</foreign>.

Names and other referring strings
  • The <rs> (referring string) element is used for any kind of name or reference

    <q>My dear <rs type='person' key='BENM1'>Mr. Bennet</rs>,</q> said <rs type='person' key='BENM2'>his lady</rs> to him one day, <q>have you heard that <rs type='place' key='NETP1'>Netherfield Park</rs> is let at last?</q>

Dates, times, numbers
  • attributes can be used to quantify <date> and <dateRange> expressions
  • similarly, times <time>, <timeRange> and numbers <num>

    Today is <date>Tuesday 29th</date>.
    Today is <date value='1994-11-29'>Tuesday 29th</date>.
    One afternoon in <date certainty='approx' value='1994-11'>late November</date>.
    One afternoon in <dateRange from='1994-11-15' to='1994-11-30' exact='to'>late November</dateRange>.
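
The point of the value attribute is that software can work with the normalised form while the surface text stays untouched. A minimal sketch of that idea follows; it assumes an XML version of the examples, and SAMPLE, PRECISION and normalised_dates are hypothetical names rather than anything defined by TEI.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<p>Today is <date value='1994-11-29'>Tuesday 29th</date>. "
    "One afternoon in <date certainty='approx' value='1994-11'>late November</date>. "
    "Today is <date>Tuesday 29th</date>.</p>"
)

# How precise a value is, judged by how many hyphen-separated parts it has.
PRECISION = {1: "year", 2: "month", 3: "day"}

def normalised_dates(root):
    """Return (surface text, numeric parts, precision) for each valued <date>."""
    results = []
    for el in root.iter("date"):
        value = el.get("value")
        if value is None:
            continue  # no normalised form was supplied by the encoder
        parts = tuple(int(part) for part in value.split("-"))
        surface = "".join(el.itertext()).strip()
        results.append((surface, parts, PRECISION.get(len(parts), "unknown")))
    return results

for entry in normalised_dates(ET.fromstring(SAMPLE)):
    print(entry)
```

The untagged third date is skipped: without a value attribute, only a human reader knows which Tuesday is meant.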

Correction and Regularization
  • <corr> and <sic> for correction (or non-correction)
  • <reg> and <orig> for normalization (or the reverse)

    ... for his nose was as sharp as a pen and a’ table of green feelds.

    ... for his nose was as sharp as a pen and <reg sic="a'">he</reg> <corr orig='table' ed=Gifford>babbl'd</corr> of green <reg sic='feelds'>fields</reg>

Omissions, Deletions, Additions
  • <gap> omission by transcriber
  • <del> cancellation in source or by editor
  • <add> or <supplied> insertion in source or by editor
  • <unclear> material uncertain because illegible
  • <damage> physical damage to text carrier

The multiple hierarchy problem
  • SGML allows only one hierarchy at a time
  • Is a document
      • chapter-paragraph-phrase
      • gathering-page-leaf
      • or both?
  • discontinuous segments
  • links and milestones

Boundary markers
  • page, column, and line breaks (<pb>, <cb>, <lb>)
  • generic <milestone>

    Diana and <pb ed='ED1' n='475'> Mary approved the step unreservedly. Dia<pb ed='ED2' n='483'>na announced that...
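
Because boundary markers are empty elements, they record where a page begins without interrupting the main hierarchy, even in mid-word. The sketch below illustrates this under two assumptions: the example is rewritten as XML (so each <pb> is self-closed), and the helper name pages_by_edition is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    "<p>Diana and <pb ed='ED1' n='475'/> Mary approved the step unreservedly. "
    "Dia<pb ed='ED2' n='483'/>na announced that...</p>"
)

def pages_by_edition(root):
    """Map each edition identifier to the page numbers that start in the text."""
    pages = {}
    for pb in root.iter("pb"):
        pages.setdefault(pb.get("ed"), []).append(pb.get("n"))
    return pages

root = ET.fromstring(SAMPLE)
print(pages_by_edition(root))

# The running text is recoverable by ignoring the markers entirely:
# the word split by the ED2 page break comes back whole.
running_text = "".join(root.itertext())
print(running_text)
```

Each edition's pagination is an independent overlay, which is how milestones sidestep the single-hierarchy restriction.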

Some chunks are also phrases
  • <list> lists of all kinds
  • <note> notes (authorial or editorial)
  • <figure> pictures or figures
  • <formula> formulae
  • <table> tables
  • <bibl> bibliographic descriptions

Lists
  • use <list> for lists of any kind (use type attribute to distinguish)
  • use <label> in two-column lists as alternative to n attribute
  • may be nested as necessary

for example...

    For my true love:
      * three calling birds
      * two french hens
      * a partridge in a pear tree
    For Uncle Joe:
      socks as usual

    <list type="xmas">
      <label>For my true love</label>
      <item><list type="bullets">
        <item>three calling birds</item>
        <item>two french hens</item>
        <item>a partridge in a pear tree</item>
      </list></item>
      <label>For Uncle Joe</label>
      <item>socks as usual</item>
    </list>

Figures and graphics
  • The presence of a graphic is indicated by the <figure> element
  • The title of the graphic is tagged as a <head>
  • A description of the graphic may be supplied (as a <figDesc>) for use by software unable to render the graphic
  • The graphic itself is specified as an external entity

for example...
    <!ENTITY fezziPic SYSTEM "fezz.gif" NDATA GIF>

    <figure entity='fezziPic'>
      <head>Mr Fezziwig's Ball</head>
      <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.</figDesc>
    </figure>

Tables
  • a <table> element contains <row>s of <cell>s
  • spanning is indicated by rows and cols attributes
  • role attribute indicates whether row or column holds data or a label
  • embedded tables are permitted

for example...

    A three column table
    Row 1    123    4567
    Row 2    abc    defgh

    <table>
      <row cols='3'><cell role='label'>A three column table</cell></row>
      <row><cell role='label'>Row 1</cell><cell>123</cell><cell>4567</cell></row>
      <row><cell role='label'>Row 2</cell><cell>abc</cell><cell>defgh</cell></row>
    </table>

Bibliography
  • Use simple <bibl> with optional subcomponents:
      • <respStmt> (for any kind of responsibility) or <author>, <editor>, etc.
      • <title> with optional level attribute
      • <imprint> groups publication details
      • <biblScope> adds page references etc.
  • Use <listBibl> for list of references

for example...
    <p>See for example <ref target='REG92'>Regis (1992)</ref>...

    <div><head>Bibliography</head>
      <listBibl>
        <bibl id='REG92'>
          <author>Ed Regis</author>
          <title level=m>Great Mambo Chicken and the TransHuman Experience</title>
          <pubPlace>London</pubPlace>
          <publisher>Penguin Books</publisher>
          <date>1992</date>
          <biblScope>pp 144 ff</biblScope>
        </bibl>
      </listBibl>
    </div>

Notes
  • Use <note> for notes of any kind (editorial or authorial)
  • if in-line, use place attribute to specify location
  • if out of line, either
      • use target attribute to specify attachment point
      • or mark attachment point as a <ref>

for example...
    <lg>
      <l>The self-same moment I could pray;</l>
      <l>And from my neck so free</l>
      <l>The albatross fell off, and sank</l>
      <l id="L213">Like lead into the sea.
        <note type="auth" place="margin">The spell begins to break.</note></l>
    </lg>

Cross References
  • Use <ptr> (empty element) or <ref> (with content)
  • use target to specify an identifier (ID value)

    See especially <ref target='SEC12'>section 12 on page 34</ref>.
    See especially <ptr target='SEC12'>.
    ...
    <div id='SEC12'>
      <head>Concerning Identifiers</head>

  • But what if the target is not in the current document?
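
Since target values must match an id somewhere in the document, dangling references are easy to detect. The following sketch is illustrative only: it assumes an XML version of the example (self-closed <ptr>, case-sensitive identifiers, unlike default SGML), adds a deliberately broken pointer SEC99 for demonstration, and uses the hypothetical name unresolved_targets.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<body>
  <p>See especially <ref target="SEC12">section 12 on page 34</ref>.</p>
  <p>See especially <ptr target="SEC12"/> and <ptr target="SEC99"/>.</p>
  <div id="SEC12"><head>Concerning Identifiers</head></div>
</body>
"""

def unresolved_targets(root):
    """Return target values on <ref>/<ptr> that match no id in the document."""
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing = []
    for el in root.iter():
        if el.tag in ("ref", "ptr"):
            target = el.get("target")
            if target is not None and target not in ids:
                missing.append(target)
    return missing

print(unresolved_targets(ET.fromstring(SAMPLE)))
```

This only covers targets inside the same document, which is exactly the limitation the closing question on the slide raises.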

HTML-style pointers in TEI
  • new URL attribute

    ... as described on <xref url="http://www.tei-c.org">the TEI website</xref>...

TEI X-pointers
  • “HyTime for Idiots”
  • location ladder
  • 15 location types
  • both SGML and non-SGML based

    <xptr doc="xdoc" from='step1 step2 step3...'>

  • to be adapted for consistency with XLink

A three way alignment

    <div id=E98 lang=EN><head>The Study</head>
    <seg id=E9801>The Study</seg>
    <seg id=E9802>is a place</seg>
    <seg id=E9803>where a Student,</seg>
    <seg id=E9804>a part from men,</seg>
    <seg id=E9805>sitteth alone,</seg>
    <seg id=E9806>addicted to his Studies,</seg>
    <seg id=E9807>whilst he readeth</seg>
    <seg id=E9808>Books,</seg>

    <div id=L98 lang=LA><head>Muséum</head>
    <seg id=L9801>Museum</seg>
    <seg id=L9802>est locus</seg>
    <seg id=L9803>ubi Studiosus,</seg>
    <seg id=L9804>secretus ab hominibus,</seg>
    <seg id=L9805>studiis deditus,</seg>
    <seg id=L9806>dum lectitat</seg>

    <xptr n='1' id=p981 doc=com98>
    <xptr n='2' id=p982 doc=com98 from='space (2d) (75 5) (133 75)'>
    <xptr n='3' id=p983 doc=com98 from='space (2d) (55 42) (90 60)'>

    <linkGrp type=alignment>
    <link targets='E9801 L9801 p981'>
    <link targets='E9802 L9802'>
    <link targets='E9803 L9803 p982'>
    <link targets='E9804 L9804'>
    <link targets='E9805 L9805'>
    <link targets='E9808 L9808 p983'>
    </linkGrp>

Not covered here...
  • specialised front and back matter
  • analytic tagging
      • segmentation
      • interpretations
  • the header
  • tags for documentation

Summary
  • How TEI Lite handles…
      • Structural divisions
      • Rendition vs. interpretation
      • Phrases, chunks, and chunky phrases
      • Pointers and links
  • Any DTD dealing with ordinary text will need a similar range