SGML: Standard Generalized Markup Language

Challenges
• Electronic delivery
• Interchange
• Maintainability
• Cycle time reduction
• Customization

Markup
• Markup
  – Writing instructions on a physical page to the typesetter regarding how the various parts of the document should be typeset
• Generalized markup
  – Separate document markup from any specific use of the document

SGML
• Formalized methodology for developing and using document markup that supports the separation of form from content and from specific processing dependencies

Why were databases invented?
• Physical and logical data independence
• Data became a resource to be shared
• Data should outlive the processes which use it

Conventional Publishing
• Didn’t exploit new technology
  – Proprietary data formats
  – Same document to be used by many different browsers
• Merging document content from multiple sources
• Didn’t reuse or share document content

Document Structures
• Document types
  – Elements
  – Database types
  – Makes possible very powerful information manipulation

Document Structures
• Newsletter
  – Banner
    • Name
    • Volume
    • Edition
    • Date
    • Publisher
    • Editorial board
    • Masthead

Document Structures
• Newsletter
  – Newsletter body
    • Contents
    • Article +
      – Title
      – Author
      – Article body
        • Section +
      – Sidebar +
        • Sidebar heading
        • Sidebar body

Document Structures
• Cookbook
  – Front matter
    • Publisher data
      – Title
      – Author
      – ISBN
      – Publishing history
    • Foreword
    • Acknowledgments
    • Table of contents

Document Structures
• Cookbook
  – Body
    • General information
      – Section +
    • Recipe category +
      – Recipe subcategory +
        • Recipe +

Document Structures
• Cookbook
  – Back matter
    • Appendix +
      – Section +
    • Glossary
      – Term, definition +
    • Index

Document Structures
• Recipe
  – Name
  – Description
  – Introduction

Document Structures
• Recipe
  – Ingredients
    • Ingredient +
      – Quantity
      – Unit
    • Procedure
      – Step +
    • Portions
    • Nutrition data

Common Elements
• Headings
• Tables
• Enumerated lists

Queries
• Locate all recipes having chicken as an ingredient
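
A query like this becomes a simple tree walk once ingredients are marked up as elements. The sketch below is only an illustration: it assumes the cookbook is available in a well-formed, XML-style encoding (Python's standard library has no SGML parser), and the sample data and the helper name `recipes_with` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical cookbook fragment; element names follow the recipe structure above.
COOKBOOK = """
<cookbook>
  <recipe><title>Chicken soup</title>
    <ingredients>
      <ingredient>chicken</ingredient>
      <ingredient>carrots</ingredient>
    </ingredients>
  </recipe>
  <recipe><title>Garden salad</title>
    <ingredients>
      <ingredient>lettuce</ingredient>
    </ingredients>
  </recipe>
</cookbook>
"""

def recipes_with(root, wanted):
    """Return the titles of recipes that list `wanted` as an ingredient."""
    titles = []
    for recipe in root.iter("recipe"):
        names = [(i.text or "").strip().lower() for i in recipe.iter("ingredient")]
        if wanted.lower() in names:
            titles.append(recipe.findtext("title"))
    return titles

root = ET.fromstring(COOKBOOK)
print(recipes_with(root, "chicken"))  # prints ['Chicken soup']
```

The query relies only on the element structure, not on how the recipe is typeset, which is the point of separating content from form.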

Entities
• Independently stored portions of a document which can be independently manipulated and used by name
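
"Used by name" can be pictured as a lookup that swaps each `&name;` reference for the stored text. This is a minimal sketch, not how any particular SGML entity manager works; the entity names `company` and `editor` and the dictionary store are assumptions made for illustration.

```python
import re

# Hypothetical entity store: name -> independently stored text.
ENTITIES = {
    "company": "Aco Company",
    "editor": "W. I. Grosky",
}

# A general entity reference has the form &name;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

def resolve_entities(text, entities):
    """Replace each &name; reference with its stored text; unknown names are left as-is."""
    def lookup(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(lookup, text)

print(resolve_entities("Published by the &company;, edited by &editor;.", ENTITIES))
# prints: Published by the Aco Company, edited by W. I. Grosky.
```

Because the text lives in one place, changing the stored value updates every document that references the entity.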

Attributes of Elements
• ID
  – Unique identifier
• IDREF
  – ID reference
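
The two attribute types imply two checks: every ID value must be unique, and every IDREF value must match some ID. The sketch below illustrates both under stated assumptions: the attribute names `id` and `refid`, the `see` element, and the XML-style sample are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: "id" plays the role of an ID attribute, "refid" of an IDREF.
DOC = """
<cookbook>
  <recipe id="r1"><title>Chicken stock</title></recipe>
  <recipe id="r2"><title>Chicken soup</title><see refid="r1"/></recipe>
  <recipe id="r3"><title>Mystery stew</title><see refid="r9"/></recipe>
</cookbook>
"""

def check_references(root):
    """Return (duplicate ID values, IDREF values that match no ID)."""
    seen, duplicates = set(), []
    for el in root.iter():
        ident = el.get("id")
        if ident is not None:
            if ident in seen:
                duplicates.append(ident)
            seen.add(ident)
    dangling = [el.get("refid") for el in root.iter()
                if el.get("refid") is not None and el.get("refid") not in seen]
    return duplicates, dangling

print(check_references(ET.fromstring(DOC)))  # prints ([], ['r9'])
```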

Rules
• Options to be associated with the table of contents for a given document type
  – The actual table-of-contents data is generated at processing time and is not supplied by the author in the source document
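
Generating the table of contents at processing time amounts to collecting titles from the document's own structure. The following sketch assumes nested `section` elements that each carry a `title`, in a well-formed XML-style encoding; the sample text and the function name `build_toc` are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical article body with nested sections.
ARTICLE = """
<article-body>
  <section><title>Where I went</title>
    <section><title>The seashore</title></section>
  </section>
  <section><title>Where I will go Next Year</title></section>
</article-body>
"""

def build_toc(parent, depth=0):
    """Collect (depth, title) pairs for nested <section> elements, in document order."""
    entries = []
    for section in parent.findall("section"):
        entries.append((depth, section.findtext("title", default="(untitled)")))
        entries.extend(build_toc(section, depth + 1))
    return entries

for depth, title in build_toc(ET.fromstring(ARTICLE)):
    print("  " * depth + title)
```

The author never types the table of contents; the processing program derives it, so it cannot drift out of step with the section titles.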

Table-of-Contents
• Case 1 - Document must have table of contents and it must be in a fixed position in formatted document
  – Not necessary to identify this element or to specify rule for it
    • The processing program produces it and places it correctly

Table-of-Contents
• Case 2 - Table of contents is optional, but if it occurs, it must be in a particular place
  – All that is necessary is an indication of whether or not table of contents is to be included
    • Can be done as an attribute of another element, say the front matter
    • Can be done by specifying an optional element in a fixed position

Table-of-Contents
• Case 3 - Document must have table of contents, but it can be located in one of several places
  – Can define element type with no structure, as its content is supplied by processing program, and have rules which specify that it must occur in one of several places
  – Can specify as an attribute of another element and allow it to have a range of values that tell where it is located

Table-of-Contents
• Case 4 - Table of contents is optional, as well as where it is placed
  – Element type defined that can occur in multiple places

Parts of an SGML document
• SGML declaration
  – Means by which it is made known which options a document is going to use
  – Usually a single declaration is used for all documents under a particular system
• Prolog
  – Usually a single document type declaration, which holds the document type definition (DTD)
  – Contains rules to which any document of a given type must conform

Parts of an SGML document
• Document instance
  – The document itself, marked up following the SGML usage conventions specified in the SGML declaration and the rules of the document type definition (DTD)

Newsletter DTD
 1 <!DOCTYPE newsletter [
   …
 2 <!ELEMENT newsletter - - (admin-sect, body)>
 3 <!ATTLIST newsletter
       newslname CDATA  #REQUIRED
       volume    NUMBER #REQUIRED
       edition   NUMBER #REQUIRED
       date      CDATA  #REQUIRED>
 4 <!ELEMENT admin-sect - - (masthead, ed-board)>
 5 <!ELEMENT masthead - - (%text;)>
 6 <!ELEMENT ed-board - - (%text;)>
 7 <!ELEMENT body - - (article+)>
 8 <!ELEMENT article - - (title, author?, article-body, sidebar+)>
 9 <!ELEMENT title - - (#PCDATA)>

Newsletter DTD
10 <!ELEMENT author - - (#PCDATA)>
11 <!ELEMENT article-body - - (section+)>
12 <!ELEMENT section - - (title?, %text;)>
13 <!ELEMENT sidebar - - (title, %text;)>
   …
   ]>
• All items
  – Double hyphens
    • Start tag and end tag are mandatory

Newsletter DTD
• Item 1
  – Start of DTD
  – Identifies document type as a newsletter
  – Skipped items we’ll get to later

Newsletter DTD
• Item 2
  – Defines highest-level element that will occur in document instance
  – Its generic identifier is “newsletter”
  – The items in parentheses are the content model
    • Defines a newsletter as containing only an administrative section (admin-sect) and a body

Newsletter DTD
• Item 3
  – Defines attributes of newsletter element type
  – Descriptive properties of newsletter
  – CDATA is character data
  – Consider <!ATTLIST newsletter newslname (health | felines | antiques) #REQUIRED>
    • Actual text of name is supplied by processing program

Newsletter DTD
• Item 5
  – %text
    • Parameter entity
    • Refers to another definition
• Item 7
  – +
    • Occurrence indicator
      – Others are *, ?
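
The occurrence indicators `?`, `+` and `*` mean the same thing as the regular-expression quantifiers with the same spelling, so a simple comma-connected content model can be checked with a regex. This is a rough sketch under that assumption: it handles only flat `,` groups, and the helper names `model_to_regex` and `conforms` are hypothetical, not part of any SGML tool.

```python
import re

def model_to_regex(model):
    """Translate a ','-connected content model into a regular expression.

    `model` is a list of (element name, occurrence indicator) pairs, where the
    indicator is "" (exactly once), "?" (optional), "+" (one or more) or
    "*" (zero or more).
    """
    parts = ["(?:%s )%s" % (re.escape(name), ind) for name, ind in model]
    return re.compile("".join(parts))

def conforms(children, model):
    """True if the sequence of child element names satisfies the model."""
    sequence = "".join(child + " " for child in children)
    return model_to_regex(model).fullmatch(sequence) is not None

# Item 8 of the newsletter DTD: (title, author?, article-body, sidebar+)
ARTICLE = [("title", ""), ("author", "?"), ("article-body", ""), ("sidebar", "+")]

print(conforms(["title", "article-body", "sidebar"], ARTICLE))  # True: author is optional
print(conforms(["title", "author", "article-body"], ARTICLE))   # False: sidebar+ needs at least one
```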

Newsletter DTD
• Item 9
  – #PCDATA
    • Parsed character data
    • String of characters of any length

Recipe DTD
 1 <!DOCTYPE cookbook [
 2 <!ELEMENT cookbook - - (front-matter, body, back-matter)>
 3 <!ELEMENT front-matter - - (publish-data, (foreword & dedication & acknowledg), toc)>
 4 <!ELEMENT publish-data - - (title, subtitle?, author+, isbn, publ-history, copyright-yr)>
 5 <!ELEMENT title - - (#PCDATA)>
 6 <!ELEMENT subtitle - - (#PCDATA)>
 7 <!ELEMENT author - - (#PCDATA)>
 8 <!ELEMENT isbn - - (#PCDATA)>
 9 <!ELEMENT publ-history - - (#PCDATA)>
10 <!ELEMENT copyright-yr - - (#PCDATA)>

Recipe DTD
11 <!ELEMENT foreword - - (title, %text;)>
12 <!ELEMENT dedication - - (#PCDATA)>
13 <!ELEMENT acknowledg - - (title, %text;)>
14 <!ELEMENT toc - O EMPTY>
15 <!ELEMENT body - - (gen-info, recipe-cat+)>
16 <!ELEMENT gen-info - - (section+)>
17 <!ELEMENT section - - (title, %text;?, section*)>
18 <!ELEMENT recipe-cat - - (section?, (recipe-subcat+ | megarecipe+))>
19 <!ELEMENT recipe-subcat - - (section?, megarecipe+)>
20 <!ELEMENT megarecipe - - (title, recipe-intro, recipe+, wrapup?)>
21 <!ATTLIST megarecipe
       nutrition CDATA   #IMPLIED
       preptime  NUMBER  #IMPLIED
       servings  NMTOKEN #IMPLIED>

Recipe DTD
22 <!ELEMENT recipe-intro - - (%text;)>
23 <!ELEMENT recipe - - (title?, recipe-intro?, ingredients, procedure, wrapup?)>
24 <!ELEMENT ingredients - - (ingredient+)>
25 <!ELEMENT ingredient - - (#PCDATA)>
26 <!ATTLIST ingredient
       metricunit CDATA  #IMPLIED
       metricquan NUMBER #IMPLIED>
27 <!ELEMENT procedure - - (step+)>
28 <!ELEMENT step - - (%text;)>
29 <!ELEMENT wrapup - - (%text;)>
   ]>

Recipe DTD
• Item 3
  – &
    • Connector
      – Others are | and ,
    • Any order
• Item 14
  – O
    • Can omit particular tag, in this case an end tag
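
The three connectors differ only in how they constrain the children of a group: `,` fixes the order, `&` requires all members in any order, and `|` allows exactly one of them. The sketch below illustrates that difference for flat groups whose members each occur once; the function names are hypothetical.

```python
def seq_ok(children, names):
    """',' connector: every member must occur, in the declared order."""
    return list(children) == list(names)

def and_ok(children, names):
    """'&' connector: every member must occur, in any order."""
    return sorted(children) == sorted(names)

def or_ok(children, names):
    """'|' connector: exactly one of the members occurs."""
    return len(children) == 1 and children[0] in names

# Item 3 of the recipe DTD: (foreword & dedication & acknowledg)
GROUP = ["foreword", "dedication", "acknowledg"]

print(and_ok(["dedication", "acknowledg", "foreword"], GROUP))  # True: any order is fine
print(seq_ok(["dedication", "acknowledg", "foreword"], GROUP))  # False: ',' would fix the order
print(and_ok(["foreword", "dedication"], GROUP))                # False: acknowledg is missing
```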

Recipe DTD
• Item 17
  – Recursion
• Item 21
  – NMTOKEN
    • Name token
    • Processing program should find value
  – #IMPLIED
    • Value doesn’t have to be given
    • Processing program chooses a default
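
The difference between #REQUIRED and #IMPLIED can be sketched as follows: a missing #REQUIRED attribute is an error, while a missing #IMPLIED attribute is filled in by the processing program. This is only an illustration; the function `apply_attlist`, the dictionary encoding of the ATTLIST, and the default value "g" are assumptions, not part of SGML itself.

```python
REQUIRED, IMPLIED = "#REQUIRED", "#IMPLIED"

# Item 3 of the newsletter DTD and item 26 of the recipe DTD, as plain dictionaries.
NEWSLETTER = {"newslname": REQUIRED, "volume": REQUIRED,
              "edition": REQUIRED, "date": REQUIRED}
INGREDIENT = {"metricunit": IMPLIED, "metricquan": IMPLIED}

def apply_attlist(given, declared, program_defaults):
    """Return the effective attributes, or raise ValueError if a #REQUIRED one is missing."""
    result = {}
    for name, default_kind in declared.items():
        if name in given:
            result[name] = given[name]
        elif default_kind == REQUIRED:
            raise ValueError("missing required attribute: " + name)
        else:
            # #IMPLIED: the author gave no value, so the program picks one.
            result[name] = program_defaults.get(name)
    return result

# Hypothetical default chosen by the processing program.
print(apply_attlist({"metricquan": "250"}, INGREDIENT, {"metricunit": "g"}))
```

Calling `apply_attlist` with the `NEWSLETTER` declaration and an incomplete attribute set raises `ValueError`, which mirrors a validating parser rejecting the start tag.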

Document Instance
<name attname="attvalue">content</name>
• element: the whole construct, from start-tag to end-tag
• start-tag: <name attname="attvalue">
• content: content
• end-tag: </name>

Document Instance
<newsletter newslname=felines volume=3 edition=6 date="18 December 1995">
  <admin-sect>
    <masthead>
      <para>Published by the Aco Company, … </para>
    </masthead>
    <ed-board>
      <para>W. I. Grosky, … </para>
    </ed-board>
  </admin-sect>

Document Instance
  <body>
    <article>
      <title>What I did on my Summer Vacation</title>
      <author>John Smith</author>
      <article-body>
        <section>
          <para>I like the seashore…</para>
          <para>I like the mountains even better… </para>
        </section>

Document Instance
        <section>
          <title>Where I will go Next Year</title>
          <para>Europe is always nice in the summer…</para>
        </section>
      </article-body>
      <sidebar>
        <title>Skiing in the Alps</title>
        <para>Oops…</para>
      </sidebar>
    </article>
    <article>…</article>
  </body>
</newsletter>

SGML Declaration
• How a document uses SGML
  – Character set
  – Concrete syntax
    • Delimiters
    • Name characters and lengths
  – Optional features

SGML System
[Diagram; labels: processing program (entry/edit, composition, etc.), SGML parser, entity manager, SGML declaration, document instance, program output]

Parser
• Read SGML declaration and establish rules for interpreting the rest of the document and applying SGML features
• Read DOCTYPE declaration, verifying that it follows rules for declaring DTDs, and determine rules for interpreting document instance

Parser
• Read document instance and validate that it follows rules of DTD
• Resolve entity references

The Web and Hypermedia
• HTML
  – Hypertext Markup Language
  – DTD of SGML
• HyTime
  – Hypermedia/Time-based Structuring Language
  – DTD of SGML

Part of HTML DTD
<!--======== Images ============-->
<!ELEMENT IMG - O EMPTY>
<!ATTLIST IMG
    SRC   CDATA                   #REQUIRED
    ALT   CDATA                   #IMPLIED
    ALIGN (top | middle | bottom) #IMPLIED
    ISMAP (ISMAP)                 #IMPLIED>
<!-- <IMG>            Image; icon, glyph or illustration -->
<!-- <IMG SRC="…">    Address of image object            -->
<!-- <IMG ALT="…">    Textual alternative                -->
<!-- <IMG ALIGN="…">  Position relative to text          -->
<!-- <IMG ISMAP>      Each pixel can be a link           -->