New Standards on the Web
Brian Kelly, UK Web Focus
UKOLN, University of Bath
Email Address: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/
UKOLN is funded by the British Library Research and Innovation Centre, the Joint Information Systems Committee of the Higher Education Funding Councils, as well as by project funding from the JISC's Electronic Libraries Programme and the European Union. UKOLN also receives support from the University of Bath where it is based.
Contents
• Introduction
• Web Standards Overview
• Web Standards:
  – Data Formats
  – Transport
  – Addressing
  – Metadata
• Deployment Issues

Aims of Talk
• To give a brief overview of web architecture
• To describe developments to web standards (especially those relevant to the library community)
• To briefly address implementation models
Due to lack of time, the talk will not cover some new standards, such as:
• Graphics
• Multimedia
• e-commerce
Standardisation
W3C (examples: PNG, HTML, HTTP)
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, influenced by member and public review
Proprietary (examples: HTML extensions, PDF and Java?)
• De facto standards
• Often initially appealing (cf PowerPoint, PDF)
• May emerge as standards
ISO (examples: Z39.50, Java?)
• Produces ISO standards
• Can be slow moving and bureaucratic
• Produce robust standards
IETF (examples: HTTP, URN, whois++)
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols developed by interested individuals
• "Rough consensus and working code"
The Web Vision
Tim Berners-Lee's vision for the Web:
• Evolvability is critical
• Automation of information management: if a decision can be made by machine, it should
• All structured data formats should be based on XML
• Migrate HTML to XML
• All logical assertions to map onto RDF model
• All metadata to use RDF
See keynote talk at WWW7 conference at <URL: http://www.w3.org/Talks/1998/0415-Evolvability/slide1-1.htm>
HTML 4.0, CSS 2.0 and DOM
HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment.
HTML 4.0 - W3C-Rec
• Improved forms
• Hooks for stylesheets
• Hooks for scripting languages
• Table enhancements
• Better printing
CSS 2.0 - W3C-Rec
• Support for all HTML formatting
• Positioning of HTML elements
• Multiple media support
DOM - W3C-Rec
• Document Object Model
• Hooks for scripting languages
• Permits changes to HTML & CSS properties and content
Problems
• Changes during CSS development
• Netscape & IE incompatibilities
• Continued use of browsers with known bugs
HTML Limitations
HTML 4.0 / CSS 2.0 have limitations:
• Difficulties in introducing new elements
  – Time-consuming standardisation process (<ABBREV>)
  – Dictated by browser vendor (<BLINK>, <MARQUEE>)
• Area may be inappropriate for standardisation:
  – Covers specialist area (maths, music, ...)
  – Application-specific (<STUD-NUM>)
• HTML is a display (output) format
• HTML's lack of arbitrary structure limits functionality:
  – Find all memos copied to John Smith
  – How many unique tracks on Jackson Browne CDs
XML (Extensible Markup Language):
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc)
• Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• HTML is being described in XML - see <URL: http://www.w3.org/TR/WD-html-in-xml/>
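As an illustration of what arbitrary elements make possible, the sketch below answers the earlier "find all memos copied to John Smith" question with Python's standard XML parser. The memo vocabulary (<memos>, <memo>, <to>, <cc>, <subject>) and the sample data are invented for this example and are not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo collection; the element names are assumptions
# made up for this illustration.
MEMOS = """
<memos>
  <memo id="m1">
    <to>Jane Jones</to>
    <cc>John Smith</cc>
    <subject>Budget</subject>
  </memo>
  <memo id="m2">
    <to>John Smith</to>
    <subject>Room booking</subject>
  </memo>
  <memo id="m3">
    <to>Ann Lee</to>
    <cc>John Smith</cc>
    <cc>Jane Jones</cc>
    <subject>Web standards talk</subject>
  </memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the ids of memos that have a <cc> element naming `name`."""
    root = ET.fromstring(xml_text)
    return [
        memo.get("id")
        for memo in root.findall("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


print(memos_copied_to(MEMOS, "John Smith"))
```

Memo m2 is addressed to John Smith rather than copied to him, so it is not matched. That distinction is carried by the element names, which is exactly the structure a display-oriented HTML page would not expose.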
XML Support
XML support:
• Can be provided at backend
• (Partial) XML support in IE 5
• Also in Netscape 5?
[Figure: XML document with no style sheet - XML tree displayed]
[Figure: XML document with style sheet]
http://www.xml.com/1999/03/ie5/first-x.xml
XLink, XPointer and XSL
XLink will provide sophisticated hyperlinking missing in HTML:
• Links that lead user to multiple destinations
• Bidirectional links
• Links with special behaviours:
  – Expand-in-place / Replace / Create new window
  – Link on load / Link on user action
• Link databases
[Figure labels: England, France]

    <commentary xml:link="extended" inline="false">
      <locator href="smith2.1" role="Essay"/>
      <locator href="jones1.4" role="Rebuttal"/>
      <locator href="robin3.2" role="Comparison"/>
    </commentary>

XPointer will provide access to arbitrary portions of an XML resource.
XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents).
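To make the "multiple destinations" idea concrete, here is a minimal sketch that reads the extended link from the slide and lists where it can lead, keyed by role. It uses only Python's standard XML parser and knows nothing about XLink itself. The xml:link attribute is the working-draft syntax shown on the slide, and the function name is an assumption for this example.

```python
import xml.etree.ElementTree as ET

# The extended link from the slide (working-draft XLink syntax).
LINK = """<commentary xml:link="extended" inline="false">
  <locator href="smith2.1" role="Essay"/>
  <locator href="jones1.4" role="Rebuttal"/>
  <locator href="robin3.2" role="Comparison"/>
</commentary>"""


def destinations(xml_text):
    """Map each locator's role to its target, in document order."""
    root = ET.fromstring(xml_text)
    return {loc.get("role"): loc.get("href") for loc in root.iter("locator")}


for role, href in destinations(LINK).items():
    print(role, "->", href)
```

A browser with link support could offer these three targets as a menu on a single anchor, which a plain HTML <A> element cannot express.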
Addressing
URLs have limitations:
• Lack of long-term persistency
  – Organisation changes name
  – Department shut down or merged
  – Directory structure reorganised
• Inability to support multiple versions of resources (mirroring)
Solutions:
• Unique identifiers possible, but resolution difficult
• Solutions include DOIs, PURLs, etc.
• "URLs don't break - people break them". Think about URL persistency and naming guidelines
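The indirection behind schemes such as PURLs and DOIs can be sketched in a few lines: the published identifier stays fixed, and only a lookup table is updated when the resource moves. The class, identifier and URLs below are hypothetical and do not reflect how any particular resolver service is implemented.

```python
class Resolver:
    """Toy persistent-identifier resolver: names stay fixed, locations may change."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        # Registering an existing identifier again simply updates its location.
        self._locations[identifier] = url

    def resolve(self, identifier):
        # Raises KeyError for unknown identifiers.
        return self._locations[identifier]


resolver = Resolver()
resolver.register("/net/example/report", "http://www.example.org/dept-a/report.html")

# The department is merged and the file moves; only the table entry changes.
resolver.register("/net/example/report", "http://www.example.org/library/report.html")

print(resolver.resolve("/net/example/report"))
```

Links that cite the identifier keep working after the move. The hard part noted on the slide, running and trusting the resolution service over the long term, is outside this sketch.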
Transport
HTTP/0.9 and HTTP/1.0:
(-) Design flaws and implementation problems
HTTP/1.1:
(+) Addresses some of these problems
(+) 60% server support
(+) Performance benefits! (60% packet traffic reduction)
(~) Is acting as fire-fighter
(-) Not sufficiently flexible or extensible
HTTP/NG:
(+) Radical redesign using object-oriented technologies
(+) Undergoing trials
• Gradual transition (using proxies)
• Integration of application (distributed searching?)
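One source of the HTTP/1.1 performance gain is that connections are persistent by default, so several requests can share one TCP connection. The sketch below shows this with Python's standard library against a throwaway local server. The handler, paths and function name are assumptions for the example, and it does not attempt to reproduce the traffic figure quoted above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    # Answering as HTTP/1.1 lets http.server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # A Content-Length is needed so the client knows where the reply ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Issue two GETs and report whether they shared one socket."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    try:
        statuses, sockets = [], []
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before reusing the connection
            statuses.append(response.status)
            sockets.append(conn.sock)
        same_socket = sockets[0] is not None and sockets[0] is sockets[1]
        return statuses, same_socket
    finally:
        conn.close()
        server.shutdown()
        server.server_close()


print(fetch_twice_on_one_connection())
```

Under HTTP/1.0 without keep-alive, each of the two requests would have needed its own connection set-up and tear-down.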
Metadata - the missing architectural component from the initial implementation of the web
[Figure: web architecture components. Addressing: URL. Transport: HTTP. Data format: HTML. Metadata: RDF, PICS, TCN, MCF, DSig, DC, ...]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management
Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the web
• DSig 2.0 will be based on RDF and will support signed assertions:
  – This page is from the University of Bath
  – This page is a legally-binding list of courses provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging Privacy Practices of Web sites and users
Note that discussions about additional rights management metadata are currently taking place.
RDF (Resource Description Framework):
• Highlight of WWW7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
  – cataloging resources
  – resource discovery
  – electronic commerce
  – intelligent agents
  – digital signatures
  – content rating
  – intellectual property rights
  – privacy
• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>
RDF Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
[Figure: RDF Data Model. Node and arc labels: Resource, Property, PropName, PropObj, Value, InstanceOf, PropertyType. Example: page.html has Cost £0.05, ValidUntil 23-Mar-99.]
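The data model can be sketched as a list of (subject, predicate, object) triples, each one a labelled arc in the graph. The example reuses the page.html cost from the slide. The intermediate node name "_:cost1" and the helper function are assumptions for illustration, and this is a simplification rather than the exact node types drawn in the figure.

```python
from collections import namedtuple

# One labelled arc in the graph: subject --predicate--> object.
Triple = namedtuple("Triple", "subject predicate object")

# "_:cost1" is a made-up intermediate node that lets the cost carry
# its own properties (the amount and how long it is valid).
graph = [
    Triple("page.html", "Cost", "_:cost1"),
    Triple("_:cost1", "Value", "£0.05"),
    Triple("_:cost1", "ValidUntil", "23-Mar-99"),
]


def properties_of(triples, subject):
    """Collect the outgoing arcs of one node as a name-to-value mapping."""
    return {t.predicate: t.object for t in triples if t.subject == subject}


cost_node = properties_of(graph, "page.html")["Cost"]
print(properties_of(graph, cost_node))
```

Because every statement has the same three-part shape, software can merge statements from different sources without knowing their vocabularies in advance.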
Browser Support for RDF
Mozilla (Netscape's source code release) provides support for RDF. Mozilla supports site maps in RDF, as well as bookmarks and history lists.
See Netscape's or HotWired home page for a link to the RDF file.
[Figure labels: Trusted 3rd Party Metadata; Embedded Metadata e.g. sitemaps]
Image from http://purl.oclc.org/net/eric/talks/www7/devday/
RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust
Deployment Issues
How can new technologies be deployed?
• Expect (hope) everyone will move to new browsers
• Use technologies in backwards-compatible manner
• Develop additional protocols e.g.
  – Transparent Content Negotiation
  – CC/PP
• User-Agent Negotiation
• Use of proxy intermediaries
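The general idea behind negotiation can be sketched with the ordinary Accept request header: the server holds several variants of a resource and returns the one the client rates highest. This is plain server-driven selection and is far simpler than Transparent Content Negotiation or CC/PP. The function names are assumptions for the example and media-type parameters other than q are ignored.

```python
def parse_accept(header):
    """Parse an Accept header into a mapping of media range to q value."""
    prefs = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0]
        if not media_range:
            continue
        q = 1.0  # the default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_range] = q
    return prefs


def quality(variant, prefs):
    """Quality of one variant: the most specific matching range wins."""
    main_type = variant.split("/")[0]
    for candidate in (variant, main_type + "/*", "*/*"):
        if candidate in prefs:
            return prefs[candidate]
    return 0.0


def choose_variant(accept_header, available):
    """Return the best available media type, or None if none is acceptable."""
    prefs = parse_accept(accept_header)
    best = max(available, key=lambda variant: quality(variant, prefs), default=None)
    if best is None or quality(best, prefs) <= 0.0:
        return None
    return best


print(choose_variant("text/xml, text/html;q=0.8", ["text/html", "text/xml"]))
```

An XML-capable browser that advertises text/xml would receive the XML variant, while an older browser sending only text/html would still be served HTML from the same address.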
Deployment Issues
More sophisticated deployment techniques can be adopted to overcome deficiencies in the simple model.
Original Model: HTML resource → Web server → browser
• Web server simply sends file to client
• File contains redundant information (for old browsers) plus client interrogation support
Sophisticated Model: HTML / XML / database resource → Intelligent Web server → Server proxy → Client proxy → browser
Intermediaries can provide functionality not available at client:
• DOI support
• XML support / format conversion
• Authentication
[Figure: Example of an intermediary]
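The format-conversion role of an intermediary can be sketched as a function that passes XML through to clients that understand it and converts it to simple HTML for those that do not. The record, the client_supports_xml flag and the HTML layout are all assumptions for this example. A real proxy would infer client capability from the request and would normally use a stylesheet for the conversion.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source record held on the server as XML.
RECORD = "<student><name>A. Smith</name><STUDENTNUMBER>1234</STUDENTNUMBER></student>"


def xml_to_html(xml_text):
    """Render the children of the root element as an HTML list."""
    root = ET.fromstring(xml_text)
    items = "".join(
        "<li><b>{}</b>: {}</li>".format(escape(child.tag), escape(child.text or ""))
        for child in root
    )
    return "<html><body><h1>{}</h1><ul>{}</ul></body></html>".format(
        escape(root.tag), items
    )


def proxy_response(resource_xml, client_supports_xml):
    """Return (content type, body) suited to the client's capabilities."""
    if client_supports_xml:
        return "text/xml", resource_xml
    return "text/html", xml_to_html(resource_xml)


print(proxy_response(RECORD, client_supports_xml=False))
```

The resource is authored once in XML and older browsers still receive something they can display, without redundant markup being stored in the file itself.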
Conclusions
To conclude:
• Standards are important, especially for national initiatives and other large-scale services
• Proprietary solutions are often tempting because:
  – They are available
  – They are often well-marketed and well-supported
  – They may become standardised
  – Solutions based on standards may not be properly supported by applications
• Metadata is a big growth area
• Intermediaries may have a role to play in deploying standards-based solutions
• Intelligent servers are likely to be important