Microsoft Office Open XML Formats
Brian Jones, Lead Program Manager, Microsoft Corporation

Agenda
- Overview of the new formats
- Demo: New Office document
- Additional benefits of the new formats
- Role of XML in Office documents
- Structural details of the new formats
- Custom-defined schema support
- File format benefits for developers
- Solution capabilities demos throughout

Microsoft Office Open XML Formats
- New XML file formats for Word, Excel and PowerPoint
- New formats will be the default file formats
- New file type extensions
- Open, transparent format improves interoperability
- Published file format specification with royalty-free license
- Transparent XML format enables new integration scenarios for documents and LOB systems
- ZIP container allows for standard compression on all files without user effort

Office Open XML Formats

Microsoft Office Open XML Formats: Added Benefits
- Compact and robust
  - ZIP container allows for standard compression on all files without user effort (dramatic file size improvements)
  - Significantly more robust files to help minimize data loss
- Backward compatible
  - Office 2000, Office XP and Office 2003 will all support the new formats
  - Patches for compatibility available by launch
  - Open, edit and save new formats
- Legacy support
  - Current Office 97-2003 binary file formats supported
  - Support for XML formats from Office 2003 and Office XP continued
- Developers
  - Endless potential for developers
  - Build solutions to read, write, and modify Office files (without the need to run Office APIs)
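The compression claim can be checked without Office, because a package is an ordinary ZIP archive. The following is a minimal sketch using Python's standard `zipfile` module; the function name `part_sizes` and the in-memory stand-in package are illustrative assumptions, not part of any Microsoft tooling.

```python
import io
import zipfile


def part_sizes(package):
    """Return (part name, stored bytes, original bytes) for each ZIP entry."""
    with zipfile.ZipFile(package) as zf:
        return [(i.filename, i.compress_size, i.file_size) for i in zf.infolist()]


# Build a tiny stand-in package in memory so the sketch runs without Office.
# The repeated markup is placeholder content, not a valid Word document.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("word/document.xml", "<w:t>hello</w:t>" * 500)

for name, stored, original in part_sizes(demo):
    print(f"{name}: {original} bytes of XML stored in {stored} bytes")
```

Passing the path of a real `.docx`, `.xlsx` or `.pptx` file to `part_sizes` should work the same way.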

Benefits of the Office Open XML Formats

The Role of XML in Office
- Capture and reuse of information
- Connecting users to data
- Intelligent applications

The Role of XML with Documents
- Document Assembly: server-based or user-assisted construction of documents from archived content or database content
  Example: create sales reports from financial and forecast data stored in a CRM system
- Content Reuse: much easier to move content between documents, including different document types
  Example: apply content stored in Word documents to Web pages quickly and efficiently
- Content Tagging: add domain-specific metadata to document content to enable custom solutions
  Example: tag presentations using a specific taxonomy to improve knowledge management efficiency
- Document Interrogation: query document repositories based on custom data, content types or document metadata
  Example: search for all documents containing a specific company name or sales contact
- Document Sanitization: remove unwanted content like comments or embedded code from your document when appropriate
  Example: remove all tracked changes and comments from a Word document before it is published
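The document interrogation scenario can be sketched with nothing but a ZIP reader and an XML parser. The example below is an illustration under stated assumptions: the function `contains_text`, the helper `make_package`, the company name and the simplified `<doc>` markup are all hypothetical stand-ins for a real repository of packages.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def contains_text(package, needle, part="word/document.xml"):
    """True if `needle` appears in the concatenated text of one XML part."""
    with zipfile.ZipFile(package) as zf:
        if part not in zf.namelist():
            return False
        root = ET.fromstring(zf.read(part))
    text = "".join(root.itertext())  # joins text across element boundaries
    return needle.lower() in text.lower()


def make_package(body_xml):
    """Hypothetical helper: wrap one XML part in an in-memory ZIP package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", body_xml)
    return buf


# Stand-in repository; real code would walk a folder of .docx files.
repository = {
    "q1-report.docx": make_package("<doc><p>Sales call with <b>Contoso</b> Ltd</p></doc>"),
    "notes.docx": make_package("<doc><p>Internal notes</p></doc>"),
}
hits = [name for name, pkg in repository.items() if contains_text(pkg, "contoso")]
print(hits)  # ['q1-report.docx']
```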

Evolution of File Formats
- Office 97: existing binary file formats, designed in 1994, launched in Office 97
- Office 2000: Early Innovation
  - XML document properties
- Office XP: First XML Format
  - Spreadsheet XML
- Office 2003: Breakthrough XML Support
  - WordML, SpreadsheetML
  - Custom-defined schema
- “Office 12” (“Wave 12”): New XML Formats
  - XML file format default
  - XML PowerPoint format

Open XML Formats Architecture
- User view: single Office “file” (file container, e.g. Questionnaire.docx)
- Developer view: modular file made up of document parts
  - Document properties
  - Comments
  - WordML / SpreadsheetML, etc.
  - Custom-defined XML
  - Images, video, sound
  - Embedded code / macros
  - Charts
- Most parts are XML
- Each XML part is a discrete, compressed component
- Can add, extract and modify individual parts without using Office programs
- Corruption or absence of any part would not prohibit the file from being opened
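Modifying one part without an Office program amounts to rewriting the ZIP archive with that entry swapped. A minimal sketch follows, assuming Python's standard `zipfile` module; `replace_part` and the part contents are hypothetical names and placeholder data chosen for illustration.

```python
import io
import zipfile


def replace_part(package_bytes, part_name, new_content):
    """Return new package bytes with one part swapped; other parts are copied as-is."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = new_content if info.filename == part_name else src.read(info.filename)
            dst.writestr(info.filename, data)
    return out.getvalue()


# Stand-in package with two parts (placeholder XML, not a valid Word file).
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as zf:
    zf.writestr("word/document.xml", "<doc>draft</doc>")
    zf.writestr("docProps/custom.xml", "<props/>")

updated = replace_part(original.getvalue(), "word/document.xml", "<doc>final</doc>")
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    print(zf.read("word/document.xml").decode())    # <doc>final</doc>
    print(zf.read("docProps/custom.xml").decode())  # <props/>
```

ZIP archives cannot be edited in place with this module, which is why the sketch copies every entry into a new archive rather than patching the old one.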

Modifying an Excel Spreadsheet

Components of the new formats
- Package: the ZIP container
- Part: the “files” inside the ZIP
- Content Types: each part has a content type that is enforced on open
- Relationships: any part that references another part must do so via a relationship
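The four components can be inspected directly. The sketch below reads the content-type map and the package-level relationships from a small in-memory package. It assumes the part names (`[Content_Types].xml`, `_rels/.rels`) and XML namespaces of the format as later standardized, which may differ from the pre-release builds these slides describe; the function names are illustrative.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces from the later-standardized format (an assumption for 2005 builds).
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
DOC_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"


def content_types(zf):
    """Map extension defaults and part-name overrides to content types."""
    root = ET.fromstring(zf.read("[Content_Types].xml"))
    defaults = {e.get("Extension"): e.get("ContentType") for e in root.iter(CT_NS + "Default")}
    overrides = {e.get("PartName"): e.get("ContentType") for e in root.iter(CT_NS + "Override")}
    return defaults, overrides


def relationships(zf, rels_part="_rels/.rels"):
    """Return (Id, Type, Target) for each relationship in a .rels part."""
    root = ET.fromstring(zf.read(rels_part))
    return [(r.get("Id"), r.get("Type"), r.get("Target"))
            for r in root.iter(REL_NS + "Relationship")]


# Stand-in package holding only the two bookkeeping parts.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr(
        "[Content_Types].xml",
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        f'<Override PartName="/word/document.xml" ContentType="{DOC_TYPE}"/>'
        "</Types>",
    )
    zf.writestr(
        "_rels/.rels",
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        f'<Relationship Id="rId1" Type="{DOC_REL}" Target="word/document.xml"/>'
        "</Relationships>",
    )

with zipfile.ZipFile(demo) as zf:
    print(content_types(zf))
    print(relationships(zf))
```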

Create a document from scratch
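As a rough illustration of what creating a document from scratch involves, the sketch below writes a three-part Word package: a content-types part, a package relationships part, and a main document part. It is not the demo from the talk. The namespaces and content types are those of the later-standardized format and may not match the 2005 pre-release builds, and `build_docx` is a hypothetical helper name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

CONTENT_TYPES = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType='
    '"application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

PACKAGE_RELS = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type='
    '"http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    "</Relationships>"
)


def build_docx(text):
    """Return the bytes of a one-paragraph package containing `text`."""
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{escape(text)}</w:t>"
        "</w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", PACKAGE_RELS)
        zf.writestr("word/document.xml", document)
    return buf.getvalue()


package = build_docx("Hello & welcome")
with zipfile.ZipFile(io.BytesIO(package)) as zf:
    body = ET.fromstring(zf.read("word/document.xml"))
    print(zf.namelist())
    print("".join(body.itertext()))  # Hello & welcome
```

Writing `package` to a file with a `.docx` extension should give a file that a compatible word processor can open, though that step is not exercised here.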

The Role of XML: Reference and custom-defined schemas
- XML Reference Schemas
  - Display-oriented (e.g. Bold, Italics, Tables, Paragraphs, Styles)
  - Open Document Format
  - Enable Archival & File Formats Interoperability
- Custom-defined Schemas
  - Data-oriented (e.g. Price, Invoice)
  - Represents the business information stored in the document
  - Enable System Integration

The Role of XML: Reference and custom-defined schemas
- XML Reference Schemas
  - Display-oriented (e.g. Bold, Italics, Tables, Paragraphs, Styles)
  - Open Document Format
  - Enable Archival & File Formats Interoperability

  <w:p>
    <w:r>
      <w:rPr><w:b /></w:rPr>
      <w:t>John Doe</w:t>
    </w:r>
    <w:r>
      <w:rPr><w:i /></w:rPr>
      <w:t>Health Agency</w:t>
    </w:r>
  </w:p>
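Because the reference schema is display-oriented, a program can recover formatting by looking for run properties. The sketch below parses a paragraph like the one on the slide and reports which runs are bold or italic. The slide does not show a namespace declaration, so the `w` namespace URI used here is the one from the later-standardized format and is an assumption; `describe_runs` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; the slide's fragment omits the declaration.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

PARAGRAPH = f"""<w:p xmlns:w="{W}">
  <w:r><w:rPr><w:b/></w:rPr><w:t>John Doe</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>Health Agency</w:t></w:r>
</w:p>"""


def describe_runs(paragraph_xml):
    """Return (text, bold, italic) for each run in a paragraph."""
    paragraph = ET.fromstring(paragraph_xml)
    runs = []
    for run in paragraph.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        bold = run.find("w:rPr/w:b", NS) is not None
        italic = run.find("w:rPr/w:i", NS) is not None
        runs.append((text, bold, italic))
    return runs


print(describe_runs(PARAGRAPH))
# [('John Doe', True, False), ('Health Agency', False, True)]
```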

The Role of XML: Reference and custom-defined schemas
- Custom-defined Schemas
  - Data-oriented (e.g. Price, Invoice)
  - Represents the business information stored in the document
  - Enable System Integration

  <ConferenceReport>
    <Date>3/24/2004</Date>
    <Attendees>
      <Attendee Name="John Doe">
        <Department>Health Agency</Department>
        <Potential>
          <Sales>100</Sales>
          <Growth>25%</Growth>
          …
      </Attendee>
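Data-oriented markup is what makes system integration straightforward: a line-of-business system can query the business fields and ignore presentation. The sketch below runs simple path queries over a report shaped like the slide's fragment. The slide elides part of the XML, so the closing tags and overall completion in `REPORT` are a hypothetical reconstruction for illustration only, and `attendee_potential` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, completed version of the slide's truncated fragment.
REPORT = """<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department> Health Agency </Department>
      <Potential><Sales>100</Sales><Growth>25%</Growth></Potential>
    </Attendee>
  </Attendees>
</ConferenceReport>"""


def attendee_potential(report_xml):
    """Extract one record per attendee using ElementTree path queries."""
    root = ET.fromstring(report_xml)
    rows = []
    for attendee in root.findall("./Attendees/Attendee"):
        rows.append({
            "name": attendee.get("Name"),
            "department": attendee.findtext("Department", "").strip(),
            "sales": int(attendee.findtext("Potential/Sales", "0")),
            "growth": attendee.findtext("Potential/Growth", "").strip(),
        })
    return rows


print(attendee_potential(REPORT))
```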

Custom defined schema

Developing against the formats
- More reliable solutions
  - Third-party tools were the main cause of document corruption
- Fully documented formats
  - Freely available for download with a royalty-free license
- Office file format schemas
  - Used to validate content for a given part
- Samples, samples
  - In the form of code “snippets” for easier use and integration into your VSTO solutions
- WinFX Package APIs
  - Access/maintain parts and relationships within a file
  - Takes care of all ZIP-level functionality
- XPath: navigation within content
- XML DOM: manipulating content
- Office Open XML Resource Kit
  - Tools for constructing and deconstructing the new file formats
  - Design-time validation tool: parses a file and reports on schema and relationship errors and warnings
  - Runtime serialization tool: flattens a package into a single file for ease of development in simple construction scenarios
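To give a feel for the relationship checks a validation tool might perform, the sketch below reports relationships whose internal target part is missing from the package. It is not the Resource Kit tool and makes no schema checks. The relationships namespace is the one from the later-standardized format, and `missing_targets`, `rels_xml` and the `urn:example:demo` relationship type are hypothetical names used only for this example.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def missing_targets(package):
    """List (rels part, target) pairs whose internal target is not in the package."""
    problems = []
    with zipfile.ZipFile(package) as zf:
        names = set(zf.namelist())
        for rels in sorted(n for n in names if n.endswith(".rels")):
            # "word/_rels/document.xml.rels" resolves targets relative to "word/".
            base = posixpath.dirname(posixpath.dirname(rels))
            for rel in ET.fromstring(zf.read(rels)).iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    continue  # external targets live outside the package
                target = rel.get("Target")
                if target.startswith("/"):
                    resolved = target.lstrip("/")
                else:
                    resolved = posixpath.normpath(posixpath.join(base, target))
                if resolved not in names:
                    problems.append((rels, target))
    return problems


def rels_xml(target):
    """Hypothetical helper: a .rels part with a single relationship."""
    return (
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        f'<Relationship Id="rId1" Type="urn:example:demo" Target="{target}"/>'
        "</Relationships>"
    )


# Stand-in package: the document part exists, the comments part does not.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("_rels/.rels", rels_xml("word/document.xml"))
    zf.writestr("word/document.xml", "<doc/>")
    zf.writestr("word/_rels/document.xml.rels", rels_xml("comments.xml"))

print(missing_targets(demo))
# [('word/_rels/document.xml.rels', 'comments.xml')]
```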

Programming against the formats

Sample Solution Scenarios
- Data interoperability
- Content manipulation
- Content sharing and reuse
- Document assembly
- Document security
- Managing sensitive information
- Document styling
- Document profiling

Resources
- Office Preview Site: http://www.microsoft.com/office/preview/
- Brian Jones’s Blog: http://blogs.msdn.com/Brian_Jones/
- Office 2003 Reference Schema Information: http://www.microsoft.com/office/xml/

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
- Slides: 22