SGML: Standard Generalized Markup Language

Challenges
- Electronic delivery
- Interchange
- Maintainability
- Cycle time reduction
- Customization

Markup
- Markup
  - Writing instructions on a physical page to the typesetter regarding how the various parts of the document should be typeset
- Generalized markup
  - Separate document markup from any specific use of the document

SGML
- Formalized methodology for developing and using document markup that supports the separation of form from content and from specific processing dependencies

Why were databases invented?
- Physical and logical data independence
- Data became a resource to be shared
- Data should outlive the processes which use it

Conventional Publishing
- Didn't exploit new technology
  - Proprietary data formats
  - Same document to be used by many different browsers
- Merging document content from multiple sources
- Didn't reuse or share document content

Document Structures
- Document types
  - Elements
  - Database types
  - Makes possible very powerful information manipulation

Document Structures
- Newsletter
  - Banner
    - Name
    - Volume
    - Edition
    - Date
    - Publisher
    - Editorial board
    - Masthead

Document Structures
- Newsletter
  - Newsletter body
    - Contents
    - Article +
      - Title
      - Author
      - Article body
        - Section +
      - Sidebar +
        - Sidebar heading
        - Sidebar body

Document Structures
- Cookbook
  - Front matter
    - Publisher data
      - Title
      - Author
      - ISBN
      - Publishing history
    - Foreword
    - Acknowledgments
    - Table of contents

Document Structures
- Cookbook
  - Body
    - General information
      - Section +
    - Recipe category +
      - Recipe subcategory +
        - Recipe +

Document Structures
- Cookbook
  - Back matter
    - Appendix +
      - Section +
    - Glossary
      - Term, definition +
    - Index

Document Structures
- Recipe
  - Name
  - Description
  - Introduction

Document Structures
- Recipe
  - Ingredients
    - Ingredient +
      - Quantity
      - Unit
  - Procedure
    - Step +
  - Portions
  - Nutrition data

Common Elements
- Headings
- Tables
- Enumerated lists

Queries
- Locate all recipes having chicken as an ingredient

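A query like this only needs the element structure, not the formatting. The following is a minimal sketch in Python, assuming the cookbook has been converted to well-formed XML (the standard library has no SGML parser) and that element names follow the recipe structure above. The sample data and the function name `recipes_with` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: an XML rendition of a tiny cookbook that follows
# the recipe structure from the slides (recipe > ingredients > ingredient).
COOKBOOK = """
<cookbook>
  <recipe><title>Chicken Soup</title>
    <ingredients>
      <ingredient>chicken</ingredient>
      <ingredient>carrots</ingredient>
    </ingredients>
  </recipe>
  <recipe><title>Green Salad</title>
    <ingredients>
      <ingredient>lettuce</ingredient>
    </ingredients>
  </recipe>
</cookbook>
"""

def recipes_with(root, wanted):
    """Return titles of recipes that list `wanted` among their ingredients."""
    wanted = wanted.lower()
    titles = []
    for recipe in root.iter("recipe"):
        names = [(ing.text or "").strip().lower()
                 for ing in recipe.iter("ingredient")]
        if any(wanted in name for name in names):
            titles.append(recipe.findtext("title"))
    return titles

root = ET.fromstring(COOKBOOK.strip())
print(recipes_with(root, "chicken"))  # ['Chicken Soup']
```

Because the ingredient is marked up as an element, the search cannot be fooled by the word "chicken" appearing in a title or a procedure step.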
Entities
- Independently stored portions of a document which can be independently manipulated and used by name

Attributes of Elements
- ID
  - Unique identifier
- IDREF
  - ID reference

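The ID/IDREF pairing implies two checks a processing system can make: every ID value is unique, and every IDREF value names an existing ID. A rough sketch in Python follows, again assuming an XML rendition of the document. The attribute names `id` and `refid` and the `see` element are illustrative only; in SGML the DTD decides which attributes have declared type ID or IDREF.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; "id", "refid" and <see> are illustrative names.
DOC = """
<cookbook>
  <recipe id="r1"><title>Chicken Stock</title></recipe>
  <recipe id="r2"><title>Chicken Soup</title>
    <see refid="r1"/>
    <see refid="r9"/>
  </recipe>
</cookbook>
"""

def check_ids(root, id_attr="id", ref_attr="refid"):
    """Return (duplicate_ids, dangling_refs) for the whole document."""
    seen, duplicates = set(), []
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    # A reference is dangling when no element carries that ID.
    dangling = [elem.get(ref_attr) for elem in root.iter()
                if elem.get(ref_attr) is not None
                and elem.get(ref_attr) not in seen]
    return duplicates, dangling

duplicates, dangling = check_ids(ET.fromstring(DOC.strip()))
print(duplicates, dangling)  # [] ['r9']
```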
Rules
- Options to be associated with the table of contents for a given document type
  - The actual data of the table of contents is generated at processing time and is not supplied by the author in the source document

Table-of-Contents
- Case 1: The document must have a table of contents, and it must be in a fixed position in the formatted document
  - Not necessary to identify this element or to specify a rule for it
    - The processing program produces it and places it correctly

Table-of-Contents
- Case 2: The table of contents is optional, but if it occurs, it must be in a particular place
  - All that is necessary is an indication of whether or not the table of contents is to be included
    - Can be done as an attribute of another element, say the front matter
    - Can be done by specifying an optional element in a fixed position

Table-of-Contents
- Case 3: The document must have a table of contents, but it can be located in one of several places
  - Can define an element type with no structure, as its content is supplied by the processing program, and have rules which specify that it must occur in one of several places
  - Can specify it as an attribute of another element and allow it to have a range of values that tell where it is located

Table-of-Contents
- Case 4: The table of contents is optional, as is where it is placed
  - An element type is defined that can occur in multiple places

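All four cases rest on the same idea: the author only signals whether and where a table of contents appears, and the processing program generates its content. The sketch below illustrates Case 2 in Python, assuming an XML rendition of the document and a hypothetical `toc` attribute on the front matter. The element names and the functions `build_toc` and `render_toc` are assumptions, not part of the original DTDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the author supplies no table-of-contents data,
# only a flag on the front matter saying one is wanted.
DOC = """
<cookbook>
  <front-matter toc="yes"/>
  <body>
    <section><title>Poultry</title>
      <section><title>Chicken</title></section>
    </section>
    <section><title>Desserts</title></section>
  </body>
</cookbook>
"""

def build_toc(elem, depth=0):
    """Collect (depth, title) pairs for every section, in document order."""
    entries = []
    for child in elem:
        if child.tag == "section":
            entries.append((depth, child.findtext("title", default="")))
            entries.extend(build_toc(child, depth + 1))
        else:
            entries.extend(build_toc(child, depth))
    return entries

def render_toc(root):
    """Generate the TOC lines only if the front matter asks for one."""
    front = root.find("front-matter")
    if front is None or front.get("toc") != "yes":
        return []
    return ["  " * depth + title for depth, title in build_toc(root)]

root = ET.fromstring(DOC.strip())
print("\n".join(render_toc(root)))
```

Changing the flag to `toc="no"` yields an empty list, and the source document itself never has to change when sections are added or renamed.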
Parts of an SGML document
- SGML declaration
  - Declares which SGML options a document is going to use
  - Usually a single declaration is used for all documents under a particular system
- Prolog
  - Usually a single document type declaration, which contains the document type definition (DTD)
  - Contains the rules to which any document of a given type must conform

Parts of an SGML document
- Document instance
  - The document itself, marked up following the SGML usage conventions specified in the SGML declaration and the rules specified in the DTD

Newsletter DTD
     1  <!DOCTYPE newsletter [
        …
     2  <!ELEMENT newsletter   - - (admin-sect, body)>
     3  <!ATTLIST newsletter
                  newslname  CDATA   #REQUIRED
                  volume     NUMBER  #REQUIRED
                  edition    NUMBER  #REQUIRED
                  date       CDATA   #REQUIRED>
     4  <!ELEMENT admin-sect   - - (masthead, ed-board)>
     5  <!ELEMENT masthead     - - (%text;)>
     6  <!ELEMENT ed-board     - - (%text;)>
     7  <!ELEMENT body         - - (article+)>
     8  <!ELEMENT article      - - (title, author?, article-body, sidebar+)>
     9  <!ELEMENT title        - - (#PCDATA)>

Newsletter DTD
    10  <!ELEMENT author       - - (#PCDATA)>
    11  <!ELEMENT article-body - - (section+)>
    12  <!ELEMENT section      - - (title?, %text;)>
    13  <!ELEMENT sidebar      - - (title, %text;)>
        …
        ]>
- All items
  - Double hyphens
    - Start tag and end tag are mandatory

Newsletter DTD
- Item 1
  - Start of DTD
  - Identifies the document type as a newsletter
  - Skipped items we'll get to later

Newsletter DTD
- Item 2
  - Defines the highest-level element that will occur in the document instance
  - Its generic identifier is "newsletter"
  - The items in parentheses form the content model
    - Defines a newsletter as containing only an administrative section (admin-sect) and a body

Newsletter DTD
- Item 3
  - Defines the attributes of the newsletter element type
  - Descriptive properties of the newsletter
  - CDATA is character data
  - Consider
        <!ATTLIST newsletter
                  newslname (health | felines | antiques) #REQUIRED>
    - Actual text of the name is supplied by the processing program

Newsletter DTD
- Item 5
  - %text;
    - Parameter entity
    - Refers to another definition
- Item 7
  - +
    - Occurrence indicator
  - Others are * and ?

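One way to see what the occurrence indicators mean is to treat a content model as a pattern over the sequence of child elements. The sketch below, in Python, translates a flat content model such as item 8 of the newsletter DTD into a regular expression. It is an illustration under narrow assumptions, not how an SGML parser works: only the `,` connector and the `?`, `+` and `*` indicators are handled, and the function names are hypothetical.

```python
import re

def model_to_regex(model):
    """Translate a flat, comma-separated content model into a regex.

    Only the ',' connector and the ?, + and * occurrence indicators are
    handled; groups, '|', '&' and parameter entities are out of scope.
    """
    parts = []
    for token in model.strip("() ").split(","):
        token = token.strip()
        indicator = token[-1] if token[-1] in "?+*" else ""
        name = token[:-1] if indicator else token
        # Each child is encoded as "name " so names cannot run together.
        parts.append("(?:" + re.escape(name) + " )" + indicator)
    return re.compile("".join(parts))

def conforms(children, model):
    """Check a list of child element names against the content model."""
    encoded = "".join(child + " " for child in children)
    return model_to_regex(model).fullmatch(encoded) is not None

ARTICLE = "(title, author?, article-body, sidebar+)"
print(conforms(["title", "author", "article-body", "sidebar"], ARTICLE))  # True
print(conforms(["title", "article-body", "sidebar", "sidebar"], ARTICLE))  # True
print(conforms(["title", "author", "article-body"], ARTICLE))  # False
```

The last call fails because `sidebar+` requires at least one sidebar, while the second succeeds because `author?` may be absent.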
Newsletter DTD
- Item 9
  - #PCDATA
    - Parsed character data
    - String of characters of any length

Recipe DTD
     1  <!DOCTYPE cookbook [
        …
     2  <!ELEMENT cookbook     - - (front-matter, body, back-matter)>
     3  <!ELEMENT front-matter - - (publish-data,
                                    (foreword & dedication & acknowledg), toc)>
     4  <!ELEMENT publish-data - - (title, subtitle?, author+, isbn,
                                    publ-history, copyright-yr)>
     5  <!ELEMENT title        - - (#PCDATA)>
     6  <!ELEMENT subtitle     - - (#PCDATA)>
     7  <!ELEMENT author       - - (#PCDATA)>
     8  <!ELEMENT isbn         - - (#PCDATA)>
     9  <!ELEMENT publ-history - - (#PCDATA)>
    10  <!ELEMENT copyright-yr - - (#PCDATA)>

Recipe DTD
    11  <!ELEMENT foreword      - - (title, %text;)>
    12  <!ELEMENT dedication    - - (#PCDATA)>
    13  <!ELEMENT acknowledg    - - (title, %text;)>
    14  <!ELEMENT toc           - O EMPTY>
    15  <!ELEMENT body          - - (gen-info, recipe-cat+)>
    16  <!ELEMENT gen-info      - - (section+)>
    17  <!ELEMENT section       - - (title, %text;?, section*)>
    18  <!ELEMENT recipe-cat    - - (section?, (recipe-subcat+ | megarecipe+))>
    19  <!ELEMENT recipe-subcat - - (section?, megarecipe+)>
    20  <!ELEMENT megarecipe    - - (title, recipe-intro, recipe+, wrapup?)>
    21  <!ATTLIST megarecipe
                  nutrition  NMTOKEN  #IMPLIED
                  preptime   CDATA    #IMPLIED
                  servings   NUMBER   #IMPLIED>

Recipe DTD
    22  <!ELEMENT recipe-intro - - (%text;)>
    23  <!ELEMENT recipe       - - (title?, recipe-intro?, ingredients,
                                    procedure, wrapup?)>
    24  <!ELEMENT ingredients  - - (ingredient+)>
    25  <!ELEMENT ingredient   - - (#PCDATA)>
    26  <!ATTLIST ingredient
                  metricunit  CDATA   #IMPLIED
                  metricquan  NUMBER  #IMPLIED>
    27  <!ELEMENT procedure    - - (step+)>
    28  <!ELEMENT step         - - (%text;)>
    29  <!ELEMENT wrapup       - - (%text;)>
        ]>

Recipe DTD
- Item 3
  - &
    - Connector
    - Any order
  - Others are | and ,
- Item 14
  - O
    - Can omit a particular tag, in this case an end tag

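The three connectors differ only in what they allow: `,` fixes the order, `|` picks one alternative, and `&` requires every member but in any order. A small Python sketch of the `&` case follows, assuming item 3's content model reads `(publish-data, (foreword & dedication & acknowledg), toc)`. The function name `front_matter_ok` is hypothetical.

```python
# Members of the "&" group: all must appear exactly once, in any order.
AND_GROUP = {"foreword", "dedication", "acknowledg"}

def front_matter_ok(children):
    """Check child element names against the assumed front-matter model."""
    if len(children) != 5:
        return False
    first, middle, last = children[0], children[1:4], children[4]
    return (first == "publish-data"      # "," connector: fixed position
            and set(middle) == AND_GROUP  # "&" connector: any order, each once
            and last == "toc")

print(front_matter_ok(
    ["publish-data", "dedication", "foreword", "acknowledg", "toc"]))  # True
print(front_matter_ok(
    ["publish-data", "foreword", "foreword", "acknowledg", "toc"]))    # False
```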
Recipe DTD
- Item 17
  - Recursion
- Item 21
  - NMTOKEN
    - Name token
    - Processing program should find value
  - #IMPLIED
    - Value doesn't have to be given
    - Processing program chooses a default

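Document Instance
- An element consists of a start-tag, content, and an end-tag

    element:    <name attname="attvalue">content</name>
    start-tag:  <name attname="attvalue">
    content:    content
    end-tag:    </name>
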
Document Instance
    <newsletter newslname=felines volume=3 edition=6
                date="18 December 1995">
      <admin-sect>
        <masthead>
          <para>Published by the Aco Company, … </para>
        </masthead>
        <ed-board>
          <para>W. I. Grosky, … </para>
        </ed-board>
      </admin-sect>

Document Instance
      <body>
        <article>
          <title>What I did on my Summer Vacation</title>
          <author>John Smith</author>
          <article-body>
            <section>
              <para>I like the seashore…</para>
              <para>I like the mountains even better… </para>
            </section>

Document Instance
            <section>
              <title>Where I will go Next Year</title>
              <para>Europe is always nice in the summer…</para>
            </section>
          </article-body>
          <sidebar>
            <title>Skiing in the Alps</title>
            <para>Oops…</para>
          </sidebar>
        </article>
        <article>…</article>
      </body>
    </newsletter>

SGML Declaration
- How a document uses SGML
  - Character set
  - Concrete syntax
    - Delimiters
    - Name characters and lengths
  - Optional features

SGML System
[Figure: SGML system. Labels shown: processing program (entry/edit, composition, etc.), SGML parser, entity manager, SGML declaration, document instance, program output]

Parser
- Read the SGML declaration and establish the rules for interpreting the rest of the document and applying SGML features
- Read the DOCTYPE declaration, verify that it follows the rules for declaring DTDs, and determine the rules for interpreting the document instance

Parser
- Read the document instance and validate that it follows the rules of the DTD
- Resolve entity references

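The last step, resolving entity references, can be pictured as substitution by name. The Python sketch below is a simplification: it assumes every reference has the form `&name;` with a closing semicolon (SGML also allows other terminators), keeps entity texts in a plain dictionary in place of an entity manager, and uses hypothetical entity names `company` and `imprint`.

```python
import re

# Simplified reference syntax: "&", a name, then ";".
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

def resolve(text, entities, _active=()):
    """Replace each &name; with its entity text, resolving nested references."""
    def replace(match):
        name = match.group(1)
        if name not in entities:
            raise KeyError(f"undeclared entity: {name}")
        if name in _active:
            raise ValueError(f"recursive entity reference: {name}")
        # Entity text may itself contain references, so resolve it too.
        return resolve(entities[name], entities, _active + (name,))
    return ENTITY_REF.sub(replace, text)

entities = {
    "company": "Aco Company",
    "imprint": "Published by the &company;",
}
print(resolve("<para>&imprint;</para>", entities))
# <para>Published by the Aco Company</para>
```

Tracking the chain of entities currently being expanded is what lets the sketch reject an entity that refers to itself, directly or indirectly, instead of looping forever.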
The Web and Hypermedia
- HTML
  - Hypertext Markup Language
  - DTD of SGML
- HyTime
  - Hypermedia/Time-based Structuring Language
  - DTD of SGML

Part of HTML DTD
    <!--======== Images ============-->

    <!ELEMENT IMG - O EMPTY>
    <!ATTLIST IMG
              SRC    CDATA                    #REQUIRED
              ALT    CDATA                    #IMPLIED
              ALIGN  (top | middle | bottom)  #IMPLIED
              ISMAP  (ISMAP)                  #IMPLIED>

    <!-- <IMG>            Image; icon, glyph or illustration -->
    <!-- <IMG SRC="…">    Address of image object            -->
    <!-- <IMG ALT="…">    Textual alternative                -->
    <!-- <IMG ALIGN="…">  Position relative to text          -->
    <!-- <IMG ISMAP>      Each pixel can be a link           -->