Web Futures Part 1 Standards Part 2 Deployment

Web Futures
Part 1 – Standards
Part 2 – Deployment Issues
Brian Kelly
UKOLN, University of Bath, BA2 7AY
Email: B.Kelly@ukoln.ac.uk
URL: http://www.ukoln.ac.uk/
UKOLN: a centre of expertise in digital information management (www.ukoln.ac.uk)
UKOLN is supported by: [sponsor logos]

Contents
• Introduction
• Standards
    • The Original Web Architecture
    • Architectural Developments
• Deployment Issues
• Discussion
Aims of Talk
• To give a brief overview of Web architecture
• To describe developments to Web standards
• To briefly address implementation models

Standards in an Educational / Research Context
Standards are important in an educational and research context to:
• Ensure widespread access to resources
• Enable resources to be reused and repurposed
• Ensure scholarly resources can be preserved
• Address accountability of public funding
• Minimise resource costs for upgrading systems
• Provide universal access to resources (cf. disability legislation)

Standards Before the Web
Access to resources typically required the use of a vendor's software, which was only available on a limited number of platforms. Often the software would be licensed.
The goal of the Web was to provide universal access to resources. Who could argue with this goal?
Need for standards to provide:
• Platform and application independence
• Avoidance of patented technologies
• Flexibility and architectural integrity
• Long-term access to data
Ideally look at standards first, then find applications which support the standards. However, it can be difficult to achieve this ideal!

Standards and the Web
Proprietary (e.g. HTML extensions, PDF and Java?)
• De facto standards
• Often initially appealing (cf. PowerPoint, PDF)
• May emerge as standards
W3C (e.g. XML, HTML, PNG, …)
• Produces W3C Recommendations on Web protocols
• Managed approach to developments
• Protocols initially developed by W3C members
• Decisions made by W3C, informed by member & public review
ISO (e.g. PNG, HTML, Z39.50, Java)
• Produces ISO Standards
• Can be slow moving and bureaucratic
• Produce robust standards
IETF (e.g. HTTP, URN, whois++)
• Produces Internet Drafts on Internet protocols
• Bottom-up approach to developments
• Protocols may be developed by interested individuals
• "Rough consensus and working code"

The Case For W3C Standards
Why use open standards developed by the W3C? Why not leave it to the marketplace?
• Pro: W3C's open standards have been developed in an open environment, with the aim of achieving platform and application independence
• Con: Commercial companies develop proprietary formats in order to maximise their profits and dividends to shareholders
• Pro: W3C's open standards have been developed to interoperate with each other according to W3C's design vision
• Con: Commercial companies typically develop proprietary formats in isolation, or along the lines of a company vision

Standards, Architectures, Applications, Resources
This talk touches on several areas:
Standards: concerned with protocols and file formats
• Open standards vs. proprietary
• HTML / XML vs. PDF
• CSS / XSL vs. HTML
• GIF vs. PNG
Architectures: models for implementing systems
• Which standards are applicable
• File system / database application
• HTML tools / content management
Applications: software products used to implement systems
• NT / Unix
• Apache / IIS
• FrontPage / Dreamweaver
• Oracle / SQLServer
• ColdFusion vs. ASP
Resources: financial and staff costs needed to implement systems
• Development vs. migration costs
• Use of in-house expertise
• In-house vs. out-sourced
• Licensed vs. open source

GIF
As an example of the dangers of using proprietary solutions, consider the GIF file format:
• Unisys announce that they hold a patent on the compression algorithm used in GIF images and that users of GIF will have to pay
• Following much debate, Unisys require payment for a licence from software developers, and also from end users of unlicensed software ($5,000!)
• Web community responds with the PNG format
• See <http://burnallgifs.org/>
WARNING:
• There is no guarantee that payment will not be required for proprietary file formats which are currently free

How Does The Web Work?
The Web has three fundamental concepts:
• URLs: addresses of resources
• HTTP: dialogue between client and server
• HTML: format of resources
Steps:
1 User clicks on a link to the address (URL) http://www.netsoft.com/hello.html
2 Browser converts the link to an HTTP command (METHOD):
    Connect to computer at www.netsoft.com
    GET /hello.html
3 Remote computer (Web server) sends the file
4 Local computer (Web browser) displays the HTML file:
    <HTML> <TITLE>Welcome</TITLE> ..
    <P>The <A HREF="…">Netsoft</A> home page</P>
[Figure: Web browser showing the page "Welcome to Netsoft", with the link text "The Netsoft home page"]
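Step 2 above can be sketched in Python. This is only an illustration, not a real client: the function name `build_get_request` is hypothetical, the request is built as text rather than sent over the network, and the `HTTP/1.0` version string and `Host` header are assumptions (the slide shows only `GET /hello.html`).

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> tuple[str, str]:
    """Split a URL into the host to connect to and the HTTP request text.

    Hypothetical helper for illustration; a real browser sends further
    headers and also handles ports, redirects and so on.
    """
    parts = urlsplit(url)
    host = parts.hostname        # the computer to connect to
    path = parts.path or "/"     # the resource to ask for
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    return host, request


host, request = build_get_request("http://www.netsoft.com/hello.html")
print(host)                      # where the browser connects
print(request.splitlines()[0])   # the request line it sends
```

The URL supplies both halves of the exchange: the host name says where to connect, and the path says what to ask for once connected.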

Approaches To HTML
Emphasis on managing HTML resources is inappropriate:
• HTML is an output format, which cannot easily be reused (e.g. WAP, e-Books, etc.)
• Need to manage HTML fragments (only partly achievable with SSIs)
• Need to manage collections of resources
• Need to have a single master source of data
• Need to support new developments such as personalisation
• Difficult to integrate with new formats
Issues
• Should we stop giving HTML training courses?
• Should we stop buying HTML authoring tools?

XML:
• Extensible Markup Language
• A lightweight SGML designed for network use
• Addresses HTML's lack of evolvability
• Arbitrary elements can be defined (<STUDENTNUMBER>, <PART-NO>, etc.)
• Agreement achieved quickly: XML 1.0 became a W3C Recommendation in Feb 1998
• Support from industry (SGML vendors, Microsoft, etc.)
• Support in latest versions of Web browsers

XML Concepts (1)
Well-formed XML resources:
• Make end-tags explicit: <li>...</li>
• Make empty elements explicit: <img ... />
• Quote attributes: <img src="logo.gif" height="20" />
• Use consistent upper/lower case: <p> and <P> are different
XML Namespaces:
• Mechanism for ensuring unique XML elements:
    <?xml:namespace ns="http://foo.org/1998-001" prefix="i">
    <p>Insert <i:PART>M-471</i:PART></p>
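The well-formedness rules above can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is hypothetical. Note one assumption in the namespace part: the slide shows an early draft syntax for declaring namespaces, whereas the final Namespaces in XML Recommendation declares them with `xmlns:` attributes, which is the form used here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<ul><li>one</li><li>two</li></ul>"))    # explicit end-tags
print(is_well_formed("<ul><li>one<li>two</ul>"))              # missing </li>
print(is_well_formed('<img src="logo.gif" height="20" />'))   # quoted, explicitly empty
print(is_well_formed("<img src=logo.gif>"))                   # unquoted and unclosed
print(is_well_formed("<p>text</P>"))                          # <p> and <P> differ

# Namespace declared with an xmlns: attribute; the URI is the one on the slide.
doc = ET.fromstring(
    '<p xmlns:i="http://foo.org/1998-001">Insert <i:PART>M-471</i:PART></p>'
)
part = doc.find("{http://foo.org/1998-001}PART")
print(part.tag, part.text)
```

The parser reports the element name together with its namespace URI, which is what keeps `PART` distinct from an element of the same name defined elsewhere.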

XML Concepts (2)
XML Schemas
• Allow constraints to be applied on XML attributes
• Express shared vocabularies and allow machines to carry out rules made by people
• Richer than DTDs
• See <http://www.w3.org/XML/Schema>
XSLT
• A language for transforming XML from one DTD to another, or to another format (e.g. PDF)
• Written in XML
• Knows about XML (e.g. tree structures, etc.)
• See <http://www.xslt.com/>

XML Concepts (3)
XLink provides sophisticated hyperlinking:
• Links that allow you to choose multiple destinations
• Bidirectional links
• Links with special behaviours:
    • Expand-in-place / Replace / Create new window
    • Link on load / Link on user action
• Link databases
• See <http://www.xml.com/pub/a/2000/09/xlink/>
[Figure labels: England, France]
XPointer
• Provides access to arbitrary portions of an XML resource
• See <http://www.devshed.com/Server_Side/XML/XPointer/page1.html>

Getting to XML With XHTML
XHTML:
• HTML represented in XML
• Some small changes to HTML:
    • Elements in lowercase: <p> not <P>
    • Attributes must be quoted: <img src="logo" height="50">
    • Elements must be closed: <p>...</p>, <img src="logo" ... />
• Gain benefits from XML
• Tools available (e.g. HTML-Kit from http://www.chami.com/html-kit/)
• See <http://www.webreference.com/xml/column6/>, <http://groups.yahoo.com/group/XHTML-L/> and <http://www.ariadne.ac.uk/issue27/web-focus/>
Note the IWMW 2002 Web site is (mostly) XHTML

CSS:
• Cascading Style Sheets
• XHTML/XML defines structure, CSS describes the appearance
• CSS 1.0 and 2.0 now W3C Recommendations
• CSS 3.0 in preparation (modularised)
• We should be using CSS:
    • Part of architecture
    • Ease of maintenance
    • Becoming much richer
    • Accessibility
• See <http://www.w3c.org/Style/CSS/>

SVG:
• Scalable Vector Graphics
• A language for describing two-dimensional graphics in XML
• See <http://www.w3.org/Graphics/SVG/Overview.htm8>
• Also see presentation on XML written in SVG at <http://www.w3c.org/Talks/2001/12/IH-Euroweb/W3CInTheWorldslide.svgz>
• WWW 2002 talk at <http://www.w3c.org/2002/Talks/www2002-SVG/>

SVG Example
http://www.karto.ethz.ch/neumann/cartography/vienna/

SVG and XSLT
This example: http://people.w3.org/maxf/ChessGML/
• Originally written in Java
• Author realised that XSLT would be easier
• Uses SVG for chess board and pieces
• Uses XSLT to move pieces

CML, SVG and XSLT
http://www.adobe.com/svg/demos/cml2svg/html/index.html
A molecule described in CML can be transformed using XSLT into SVG, allowing it to be displayed and manipulated

SMIL:
• Synchronized Multimedia Integration Language
• A language for authoring interactive audiovisual presentations
• Allows you to synchronize text, images, audio and video in a document
• An XML application
• See <http://www.w3c.org/AudioVideo/>

SMIL Example
• http://www.reseau.it/smilapp_en.html
• http://www.kevlindev.com/tutorials/basics/animation/svg_smil/index.htm

MathML:
• An XML application for maths
• Various plugins, dedicated readers, etc.
• Mozilla renders natively
See <http://www.mozilla.org/projects/mathml/>

Modularisation
How can you:
• Include XML resources such as MathML, ChemML, etc. in XHTML documents?
• Provide a subset of XHTML features in browsers on devices such as mobile phones, PDAs, etc.?
The answer is:
• XHTML modularisation (modularization)
• See <http://www.w3.org/TR/xhtml-modularization/> and <http://www.xml.com/pub/a/2002/01/16/xhtml-m12n.html>

Addressing (1)
URLs have limitations:
• Lack of long-term persistency:
    • University changes name, or department is shut down or merged
    • Directory structure reorganised
• Inability to support multiple versions (mirroring)
URIs:
• Were originally just the address of a resource, and moving a resource was annoying but not critical
• With the development of "Web services", structured resources, B2B communications, etc. the availability of URIs will be of great importance

Addressing (2)
Solutions:
• Unique identifiers possible, but resolution difficult
• Solutions include DOIs, PURLs, OpenURLs, etc.
• Interest mostly in publishing sector
• "URIs don't break - people break them"
• Think about URL persistency & naming guidelines: <http://www.ariadne.ac.uk/issue31/web-focus/>

Transport - The Original Roadmap
HTTP/0.9 and HTTP/1.0:
• Con: Design flaws and implementation problems
HTTP/1.1:
• Pro: Addresses some of these problems
• Pro: 60% server support
• Pro: Performance benefits! (60% packet traffic reduction)
• Neutral: Is acting as fire-fighter
• Con: Not sufficiently flexible or extensible
HTTP/NG:
• Pro: Radical redesign using object-oriented technologies
• Pro: Undergoing trials
• Pro: Gradual transition (using proxies)

Transport - Today:
• Responsibility for development moved from W3C to IETF
• Little progress with HTTP/NG
• Problems with HTTP/1.1:
    • Lengthy (176-page) specification without much explicit rationale for design decisions
    • Environment has become more complex
    • Lack of a clean underlying data model
    • …
• See "Clarifying the Fundamentals of HTTP" <http://www2002.org/CDROM/refereed/444/>

SOAP:
• Simple Object Access Protocol
• Facilitates development of machine-to-machine communications using Web protocols by providing a richer XML-based messaging mechanism
• A protocol for invoking methods on servers, services, components and objects
• Codifies existing practice of using XML and HTTP as a method invocation mechanism
• See FAQ at <http://www.develop.com/soap/soapfaq.htm>
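To make "XML as a method invocation mechanism" concrete, the sketch below builds a SOAP 1.1 envelope in Python. It is an illustration under stated assumptions: the service namespace `http://example.org/thumbnails`, the method name `GetThumbnail` and its parameters are all hypothetical, and the message is only constructed, not sent (in practice it would be POSTed to the service over HTTP).

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/thumbnails"            # hypothetical service namespace


def build_soap_call(method: str, params: dict[str, str]) -> bytes:
    """Wrap a method name and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The method being invoked is simply an element inside the Body.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="utf-8")


message = build_soap_call(
    "GetThumbnail", {"url": "http://www.ukoln.ac.uk/", "width": "100"}
)
print(message.decode("utf-8"))
```

The call is ordinary XML, so any platform that can produce and parse XML and speak HTTP can take part.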

Metadata - the missing architectural component from the initial implementation of the Web
[Figure: Web architecture with four components: Addressing (URL), Transport (HTTP), Data format (HTML) and Metadata (RDF, PICS, TCN, MCF, DSig, DC, ...)]
Metadata Needs:
• Resource discovery
• Content filtering
• Authentication
• Improved navigation
• Multiple format support
• Rights management

Metadata Examples
DSig (Digital Signatures initiative):
• Key component for providing trust on the Web
• DSig 2.0 will be based on RDF and will support signed assertions:
    • This page is from the University of Bath
    • This page is a legally-binding list of courses provided by the University
P3P (Platform for Privacy Preferences):
• Developing methods for exchanging privacy practices of Web sites and users
Note that discussions about additional rights management metadata are currently taking place

RDF (Resource Description Framework):
• Highlight of WWW7 conference
• Provides a metadata framework ("machine understandable metadata for the web")
• Based on ideas from content rating (PICS), resource discovery (Dublin Core) and site mapping (MCF)
• Applications include:
    • cataloguing resources
    • resource discovery
    • electronic commerce
    • intelligent agents
    • digital signatures
    • content rating
    • intellectual property rights
    • privacy
• See <URL: http://www.w3.org/Talks/1998/0417-WWW7-RDF>

RDF Data Model
RDF:
• Based on a formal data model (directed labelled graphs)
• Syntax for interchange of data
• Schema model
[Figure: the resource page.html has the property Cost with the value £0.05. A second graph treats the property itself as a node (labels: PropObj, InstanceOf, Property, PropName, Value) so that ValidUntil 23-Mar-99 can be attached to it]
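One way to picture a directed labelled graph is as a set of (resource, property, value) statements. The Python sketch below is a simplification made for illustration: the helper `values_of` is hypothetical, and `ValidUntil` is attached directly to `page.html` rather than to the Cost property as the slide's diagram appears to do.

```python
# Each statement is a (resource, property, value) triple: the resource and the
# value are nodes in the graph, and the property is the label on the arc.
triples = {
    ("page.html", "Cost", "£0.05"),
    ("page.html", "ValidUntil", "23-Mar-99"),  # simplification, see lead-in
}


def values_of(graph, resource, prop):
    """Return every value that `resource` has for the property `prop`."""
    return sorted(v for (r, p, v) in graph if r == resource and p == prop)


print(values_of(triples, "page.html", "Cost"))
print(values_of(triples, "page.html", "Author"))  # no such statement
```

Because every statement has the same shape, statements from different vocabularies can be merged into one graph without any central coordination.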

RDF Conclusion
• RDF is a general-purpose framework
• RDF provides structured, machine-understandable metadata for the Web
• Metadata vocabularies can be developed without central coordination
• RDF Schemas describe the meaning of each property name
• Signed RDF is the basis for trust
But:
• Is it too complex?
• Is it the right approach?

RSS – An XML/RDF Application
RSS (Rich / RDF Site Summary):
• Initially XML, now an RDF application
• Used for news feeds
• Lightweight approach that we should be investigating (e.g. see news page on IWMW 2002 Web site)
See example of an RSS authoring tool and parser at <http://rssxpress.ukoln.ac.uk/>
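Part of the appeal of RSS is how little code is needed to consume a feed. The Python sketch below reads item titles and links from a feed in the plain-XML RSS 0.91 style (the RDF-based RSS 1.0 additionally uses XML namespaces, which this sketch does not handle). The feed content, its `example.org` URLs and the helper name `read_items` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed in the plain-XML RSS 0.91 style.
SAMPLE_FEED = """<rss version="0.91">
  <channel>
    <title>Local News</title>
    <link>http://www.example.org/news/</link>
    <description>Example institutional news feed</description>
    <item>
      <title>Workshop programme published</title>
      <link>http://www.example.org/news/programme.html</link>
    </item>
    <item>
      <title>Booking now open</title>
      <link>http://www.example.org/news/booking.html</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every item in the feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in read_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

The same few lines work for any publisher's feed in this style, which is what allows news to be aggregated from many sources.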

Model For News Feeds
[Figure: RSS feeds from a community (e.g. MIDAS), an institution (e.g. Bath) and external sources (e.g. BBC) are combined into Local News, JISC News and National News. Labels: Zope CMS outputs to RSS & XHTML; converted to RSS; structured database converted to RSS]
Good For User
The end user can choose her news feeds, including local news, news from JISC services and news from third parties
Good For Service
The service can choose its own information flow model. Its news is disseminated automatically.

What About Tomorrow?
Two interesting areas:
The Semantic Web
• Will allow intelligent agents to know about resources
• AI and ontologists meet the Web
• Uses RDF (Resource Description Framework), W3C's framework for metadata
• Some concerns over scale of problem
• See <http://www.w3.org/2001/sw/>
Web Services
• Highlight of the WWW10 and WWW2002 conferences

Web Services
The Web:
• Initially used for viewing static resources
• Then interactive services built (e.g. e-learning)
We now want:
• Programmable Web services which can be used by other Web services using standard Web protocols
We have experience of the first generation of externally-hosted Web services (stats services, voting systems, etc.): see <http://www.ariadne.ac.uk/issue23/web-focus/>. The next generation will be programmable and machine-understandable.
Note that concerns over outsourcing may be an issue

Example
Some examples at gotdotnet.com:
• Mailsender
• Thumbnail Generator: http://www.gotdotnet.com/playground/services/thumbnailgen.aspx
The concepts have been around for some time and are now being standardised (UDDI, WSDL, SOAP, …)

We've Been Here Before
Reusable components available on the network:
• Sounds like COM/DCOM, CORBA, etc. for reusable program components
Network services for use within a community:
• Sounds like JISCmail, RDN, EDINA, MIMAS, BIDS, Mirror Service and other JISC Services
• It's outsourcing – but it's OK!
Web Services And UK HE / FE Communities
Sounds like a great idea:
• We've the organisational framework to develop national services (JISC, etc.)
• We've got the network
• We've a community which is willing to exploit centrally-provided services and wants to avoid reinventing the wheel (haven't we?)

UK HE Example - Currently...
[Figure: the end user reaches local content, national content and international content, each through its own separate Web interface]
We should be moving away from providing separate Web services with their own interfaces …

Currently...
[Figure: as before, the end user reaches local, national and international content through separate Web interfaces, and also reaches Collection Description (e.g. Agora), User Profile (e.g. Headline) and Authentication (Athens) services through further separate Web interfaces]
… and separate metadata repositories and access services (which are sometimes centralised) …
Agora and Headline are eLib hybrid library projects

Future...
[Figure: the end user has brokered access, provided by an institutional portal (MLE, …), to Content, to Metadata Services / Access (Web) Services (collection description, user profile, authentication) and to Application Services? (bookmarks, spellchecker)]
… and move to Web-accessible, machine-understandable Web services as well as seamless access to content

Other W3C Areas
See
• W3C site map at <http://www.w3c.org/Help/siteindex>
• TimBL's Web Design Issues at <http://www.w3c.org/DesignIssues>
• Web Architecture from 50,000 feet at <http://www.w3.org/DesignIssues/Architecture.html>

Web Futures
Part 2 – Deployment Issues

Deployment Architectures
Let us consider the following areas:
• Content Management
• Systems Architecture
• Access (Browser support)

Position Today
What should we be doing today?
• Moving away from creating new content in HTML
• Moving to XHTML as part of the migration
• Deploying XML applications
• Storing structured information in a neutral database
• Using a CMS to manage our content
• Deploying B2B applications (such as RSS) to avoid the human bottleneck
Note that these are aspirations. We will, of course, be constrained by existing systems, resource implications, vested interests, inertia, etc.

The CMS To The Rescue
HTML authoring tools have limitations (as has HTML).
A CMS (Content Management System):
• Allows fragments to be managed
• Allows collections to be managed
• Allows resources to be stored in a neutral format (backend database)
• Allows resources to be reused
• Often provides access control
• Often provides workflow processes and project management
Issues
• CMS can be expensive
• CMS can be free but have support implications
• Which one to choose?

Content Management
Storing resources in HTML and GIF/JPEG:
• Pro: Is easy to do and is a low cost solution
• Con: Makes reuse and management of resources difficult
[Figure: master resources (XML, TIFF, …) are converted on the fly or in batch to delivery formats (HTML, WML, GIF / JPEG), with user-agent negotiation selecting the format delivered]
Content Management System for:
• Management of content (content maintenance, metadata management, access rights, project management, …)
• Delivery of content (e.g. user-agent negotiation, alternative file formats [such as WML], etc.)
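User-agent negotiation, as mentioned above, means choosing a delivery format from what the requesting device says about itself. The Python sketch below shows the idea only: the rule table, the marker strings and the function name `choose_format` are hypothetical, and a real CMS would use a fuller device database or the HTTP `Accept` header.

```python
# Hypothetical rules: substring of the User-Agent header -> delivery format.
# Rules are tried in order and the first match wins.
RULES = [
    ("Nokia", "wml"),        # assumed marker for a WAP phone
    ("Mozilla/5", "xhtml"),  # assumed marker for a recent browser
]
DEFAULT_FORMAT = "html"      # fallback for anything unrecognised


def choose_format(user_agent: str) -> str:
    """Pick the delivery format for a request from its User-Agent header."""
    for marker, fmt in RULES:
        if marker in user_agent:
            return fmt
    return DEFAULT_FORMAT


for agent in ("Nokia7110/1.0", "Mozilla/5.0 (X11; Linux)", "Mozilla/4.0 (compatible)"):
    print(agent, "->", choose_format(agent))
```

Keeping the rules in data rather than in code means that support for a new device can be added without touching the delivery logic, provided the master copy is held in a neutral format.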

Systems Architecture
Issues for you to consider:
• Operating System: Should you go for a Unix OS or Windows NT? If Unix, should you go for Linux?
• Open Source vs Licensed Solution: Should you go for an open source solution or buy a licensed application?
• Package vs Do It Yourself: Should you make use of a pre-packaged solution or develop your own solution based on a toolkit (e.g. database, scripting language, …)?
There are no global solutions. Your choice should be based on expertise available locally, resourcing issues, discussions with partners, solutions providers, etc.

Browser Issues
Which approach to browser issues should you take?
Either: Web sites should be usable by old browsers, as these are still in use and we aim to maximise access. Therefore you should deliver HTML 3.2 / 4.0 and avoid technologies such as JavaScript and CSS.
Or: Old browsers are broken and fail to implement new technologies which provide (a) richer functionality, (b) support for new devices and (c) better support for people with disabilities. Therefore you should use the latest stable versions of HTML (XHTML), CSS, etc.
NOTE
• Use of 'clean' HTML should degrade gracefully
• XHTML is a useful transition
• User-agent negotiation may be relevant
QUESTION
• Should organisations / the community implement a browser policy?

Questions
Any questions?
