Importing XML Importing and Splitting XML Data Date

  • Slides: 25
Download presentation
Importing XML Importing and Splitting XML Data Date: 9/30/2008 M D Metadata Solutions Dan

Importing XML Importing and Splitting XML Data Date: 9/30/2008 M D Metadata Solutions Dan Mc. Creary President Dan Mc. Creary & Associates dan@danmccreary. co Copyright 2008 Dan Mc. Crearym & Associates 1

Outline • Take a spreadsheet of data in Microsoft Excel • Use o. Xygen’s

Outline • Take a spreadsheet of data in Microsoft Excel • Use o. Xygen’s import Excel and import the data into and XML file • Save the XML file to a new collection and create a test XQuery using doc() • Create an XQuery that will split the single XML file into multiple XML files, one file per row • Create a new XQuery using the collection() function M D Copyright 2008 Dan Mc. Creary & Associates 2

Original Spreadsheet Headers M D Copyright 2008 Dan Mc. Creary & Associates 3

Original Spreadsheet Headers M D Copyright 2008 Dan Mc. Creary & Associates 3

Sample ISO Data Element Terms Sample of ISO 11179 Representation Terms • • •

Sample ISO Data Element Terms Sample of ISO 11179 Representation Terms • • • M D Date Code ID Indicator Name Text Document. Approval. Date Object Class Property Term See http: //en. wikipedia. org/wiki/Representation_term for a complete list Copyright 2008 Dan Mc. Creary & Associates 4

Column Headers with Representation Term Suffix M D Copyright 2008 Dan Mc. Creary &

Column Headers with Representation Term Suffix M D Copyright 2008 Dan Mc. Creary & Associates 5

Select the Sheet M D Copyright 2008 Dan Mc. Creary & Associates 6

Select the Sheet M D Copyright 2008 Dan Mc. Creary & Associates 6

M D Copyright 2008 Dan Mc. Creary & Associates 7

M D Copyright 2008 Dan Mc. Creary & Associates 7

M D Copyright 2008 Dan Mc. Creary & Associates 8

M D Copyright 2008 Dan Mc. Creary & Associates 8

M D Copyright 2008 Dan Mc. Creary & Associates 9

M D Copyright 2008 Dan Mc. Creary & Associates 9

M D Copyright 2008 Dan Mc. Creary & Associates 10

M D Copyright 2008 Dan Mc. Creary & Associates 10

Remove All Underscores M D Copyright 2008 Dan Mc. Creary & Associates 11

Remove All Underscores M D Copyright 2008 Dan Mc. Creary & Associates 11

Hints on Using Replace • Remember to include the “>” at the end to

Hints on Using Replace • Remember to include the “>” at the end to replace all begin and end tags • Replace “Old. Name>” with “New. Name>” • You can selectively use case • You can use regular expressions • You can use XPath expressions M D Copyright 2008 Dan Mc. Creary & Associates 12

Final Result <countries> <country> <Country. Name>AFGHANISTAN</Country. Name> <Primary. Author>Chris</Primary. Author> <Start. Date>38795</Start. Date> <First.

Final Result <countries> <country> <Country. Name>AFGHANISTAN</Country. Name> <Primary. Author>Chris</Primary. Author> <Start. Date>38795</Start. Date> <First. Draft. Date>38803</First. Draft. Date> <First. Review. Date>38815</First. Review. Date> <Author. Redraft. Date>38829</Author. Redraft. Date> <Second. Reviewer. Name>Amy</Second. Reviewer. Name> <Second. Review. Date>38858</Second. Review. Date> <Second. Review. Final. Date>38872</Second. Review. Final. Date> <Compleation. Date>38885</Compleation. Date> </country> … M D Copyright 2008 Dan Mc. Creary & Associates 13

Splitting Large Files • Many large XML files can be considered collections of uniform

Splitting Large Files • Many large XML files can be considered collections of uniform data structures – A spreadsheet is a collection of “row” objects • It is sometimes advantageous to put each object in its own XML file – Record Locking: Allows a file to be locked by a single user for editing – Metadata: The database automatically can keep track of when these records were created and when they were modified (document-metadata) – Versioning: Each object can have its own version history M D Copyright 2008 Dan Mc. Creary & Associates 14

Missing Updates Problem Editing Item 47 Bob Alice Bob Opens Edit Form Editing Item

Missing Updates Problem Editing Item 47 Bob Alice Bob Opens Edit Form Editing Item 47 Alice Opens Edit Form Bob Saves Changes Alice Overwrites Bob’s Changes • What happens when two people load a record into the web client at the same time? • If Alice reads the record before the Bob does his save, Bobs updates will be lost when Alice does her update time M D Copyright 2008 Dan Mc. Creary & Associates 15

Solution: Run File Through a “Splitter” • Use a small XQuery called a “splitter”

Solution: Run File Through a “Splitter” • Use a small XQuery called a “splitter” to break a single file into many small files • Visit each “row” and create a new file with the row’s content • Use a counter to create the file name: – 1. xml, 2. xml, 3. xml – This is known as an “artificial key” M D Copyright 2008 Dan Mc. Creary & Associates 16

Using the store function • All XML databases provide a function that allows you

Using the store function • All XML databases provide a function that allows you to store a new file into a collection • In e. Xist this is called xmldb: store() – xmldb: store($collection, $filename, $data) • Where: – $collection is the place you want to store the new file – $filename is the name of the file – $data is the data you want to put into the file M D Copyright 2008 Dan Mc. Creary & Associates 17

Basic splitter. xq Structure for $item at $count in doc($input-file)/root/row let $filename : =

Basic splitter. xq Structure for $item at $count in doc($input-file)/root/row let $filename : = concat($count, '. xml') let $result : = store($collection, $filename, $item) return $result te that file names are a concatenation of the count and the suffix “. xm M D Copyright 2008 Dan Mc. Creary & Associates 18

Testing Steps • Built a test splitter (splitter-test. xq) with the store line commented

Testing Steps • Built a test splitter (splitter-test. xq) with the store line commented out • Run the test to make sure the data is exactly the way you want it • Uncomment the line and save to splitter. xq • Run the script (only once) • View Results M D Copyright 2008 Dan Mc. Creary & Associates 19

xquery version "1. 0"; Doc Workflow Example let $input-file : = '/db/apps/doc-reviewer/import. xml' let

xquery version "1. 0"; Doc Workflow Example let $input-file : = '/db/apps/doc-reviewer/import. xml' let $data-collection : = '/db/apps/doc-reviewer/data' splitter-test. xq (: the login used must have write access to the collection : ) let $output-collection : = xmldb: login($data-collection, ‘login', ‘password') return <Split. Results>{ for $country at $count in doc($input-file)/countries/country let $filename : = concat($count, '. xml') (: let $store-return : = xmldb: store($data-collection, $filename, $country) : ) return <store> <id>{$count}</id> <command>store({$data-collection}, {$filename}, {$country})</command> <results></results> </store> }</Split. Results> Copyright 2008 Dan Mc. Creary & Associates 20

Test Run Results M D Copyright 2008 Dan Mc. Creary & Associates 21

Test Run Results M D Copyright 2008 Dan Mc. Creary & Associates 21

xquery version "1. 0"; Doc Workflow Example let $input-file : = '/db/apps/doc-reviewer/import. xml' let

xquery version "1. 0"; Doc Workflow Example let $input-file : = '/db/apps/doc-reviewer/import. xml' let $data-collection : = '/db/apps/doc-reviewer/data' splitter-test. xq (: the login used must have write access to the collection : ) let $output-collection : = xmldb: login($data-collection, ‘login', ‘password') return <Split. Results>{ for $country at $count in doc($input-file)/countries/country let $filename : = concat($count, '. xml') let $store-return : = xmldb: store($data-collection, $filename, $country) return <store> <id>{$count}</id> <command>store({$data-collection}, {$filename}, {$country})</command> <results>{$store-return}</results> </store> }</Split. Results> M D Copyright 2008 Dan Mc. Creary & Associates 22

Actual Run M D Copyright 2008 Dan Mc. Creary & Associates 23

Actual Run M D Copyright 2008 Dan Mc. Creary & Associates 23

Inspecting Results M D Copyright 2008 Dan Mc. Creary & Associates 24

Inspecting Results M D Copyright 2008 Dan Mc. Creary & Associates 24

Thank You! Please contact me for more information: • • • Native XML Databases

Thank You! Please contact me for more information: • • • Native XML Databases Metadata Management Metadata Registries Service Oriented Architectures Business Intelligence and Data Warehouse Semantic Web Dan Mc. Creary, President Dan Mc. Creary & Associates Metadata Strategy Development dan@danmccreary. com (952) 931 -9198 M D Copyright 2008 Dan Mc. Creary & Associates 25