LDS3: Applying Preservation Principals to Linked Data Systems
David Tarrant @davetaz@ecs.soton.ac.uk
Open Planets Foundation / University of Southampton
iPres 2012, Toronto, October 2012
This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137).

Present Day

Presenting the REF: The Results Evaluation Framework
• 5 Tools (Droid, Fits, file, fido, Tika)
• 65 Versions (from 2008 to now)
• 1 Govdocs Corpus
• 1 Question…

How accurate are file format identification tools historically?

PDF 1.4

DOCX

9 Months Ago

Why is Data Important?
• Data and Metadata are knowledge.
• Knowledge is power.
• Knowledge enables decision.
• Knowledge enables process.
• Knowledge empowers action.
• Knowledge enables us to say because…

Processes: A Classic Flow Chart
[Flow chart with nodes: DATA, Decision, Process]
Data is key to making decisions

Policy: A Preservation Flow Chart
[Flow chart with nodes: DATA, Policy, Process]
Data is key to informing policy

Policy Data - Generated
• When?
• Who?
• What it affects?
• What action is taken?
• Why?
[Diagram label: Policy]

Why?
• When?
• Who?
• What it affects?
• What action is taken?
• Why? DATA
• Because something said so? DATA

Case Study Example (Opinion)
• Due to format obsolescence, all Flash video files are to be migrated to H.264/AAC.
• Input data: Study on the proliferation of Flash and evidence of lacking support from the rights holder, Adobe.
• File B was created from File A a year ago as it was identified as being a Flash video file.
• Today, File A is identified as being an Ogg video file.
• What has changed? Why? Does it affect me? Who generated the wrong information? Did they generate any other wrong information?

I Don’t Know!

6 Months Ago

A Fact?
File#1 hasIdentification application/zip

Provenance
Tim Berners-Lee Provides 5-Star Linked Data Guide
• Tarrant, David and Carr, Leslie (2012) LDS3: Applying Digital Preservation Principals to Linked Data Systems. In, Ninth International Conference on Digital Preservation (iPres 2012), Toronto, Canada

Data!!!
• One fact.
• One document the fact comes from.
• One citation about the document's place of publication.
• Who, What, When and Where.
• Who they worked for and with.

Named-Graph
• In Linked Data a document is called a named-graph.
[Diagram: File#1 hasIdentification application/zip]
• But these also get used for two purposes!!

The two uses of the named-graph
No. 1 – Data Publication
[Diagram: Named-Graph holding DATA, including File#1 hasIdentification application/zip]

The two uses of the named-graph
No. 2 – Data Discovery/Query
[Diagram: Named-Graph holding DATA, including File#1 hasIdentification application/zip and File#1 hasIdentification application/msword]

The two uses of the named-graph
No. 2 – Data Discovery/Query
[Diagram: Named-Graph combining File#1 hasIdentification application/zip and File#1 hasIdentification application/msword, each with a "Works For" link]
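The discovery/query use can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain sets of triples stand in for named-graphs, the triples are the ones drawn on the slide, and the names `graph_zip`, `graph_msword`, `query_graph` and `objects` are hypothetical. It shows that once two published graphs are poured into a single query graph, both identifications come back with nothing to say which source asserted which.

```python
# Triples are (subject, predicate, object); each set stands in for one named-graph.
# The graph contents follow the slide; all variable and function names are hypothetical.
graph_zip = {("File#1", "hasIdentification", "application/zip")}
graph_msword = {("File#1", "hasIdentification", "application/msword")}

# Data discovery/query: pour every published graph into one query graph.
query_graph = graph_zip | graph_msword


def objects(graph, subject, predicate):
    """Return all objects asserted for (subject, predicate), sorted for stable output."""
    return sorted(o for (s, p, o) in graph if s == subject and p == predicate)


answers = objects(query_graph, "File#1", "hasIdentification")
print(answers)  # two conflicting identifications, with no record of who said which
```

The merged set answers the question "what has been said about File#1?" but not "who said it?", which is the gap the rest of the talk addresses.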

Quads
Query Graph
• Source Graph 1: File#1 hasIdentification application/zip
• Source Graph 2: File#1 hasIdentification application/msword
After all, RDF is a graph model (RDF the spec, not the RDF/XML serialization)

Quads
Query Graph
• Source Graph 1: File#1 usesTool File 5.04, hasIdentification application/zip
• Source Graph 2: File#1 usesTool File 5.07, hasIdentification application/msword
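A minimal sketch of the quad idea, assuming a quad is simply a triple plus the name of the source graph it came from. The graph names `source-graph-1` and `source-graph-2`, the `Quad` type and the helper function are hypothetical; the tool versions and identifications are taken from the slide. It is not how LDS3 stores data, only an illustration of why keeping the source graph lets a query explain a conflicting answer.

```python
from collections import namedtuple

# A quad is a triple plus the named graph (source document) it came from.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "graph"])

# Values follow the slide; the graph names are hypothetical.
store = [
    Quad("File#1", "usesTool", "file 5.04", "source-graph-1"),
    Quad("File#1", "hasIdentification", "application/zip", "source-graph-1"),
    Quad("File#1", "usesTool", "file 5.07", "source-graph-2"),
    Quad("File#1", "hasIdentification", "application/msword", "source-graph-2"),
]


def identifications_with_source(quads, subject):
    """Pair each identification with its source graph and the tool recorded there."""
    tools = {q.graph: q.obj for q in quads
             if q.subject == subject and q.predicate == "usesTool"}
    return [(q.obj, q.graph, tools.get(q.graph))
            for q in quads
            if q.subject == subject and q.predicate == "hasIdentification"]


results = identifications_with_source(store, "File#1")
for mime, graph, tool in results:
    print(f"{mime} (from {graph}, produced by {tool})")
```

Each answer now carries its source graph, so the two identifications can be attributed to different versions of the same tool rather than treated as an unexplained contradiction.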

Still with me…
• Ok so what about versioning?
File1/Identification/tool/file/version/5.03: File#1 hasIdentification
File1/Identification/tool/file/version/5.07: File#1 hasIdentification application/msword
[Diagram label: University of Southampton]

Latest
/File1/Identification/tool/file/ (Latest)
File1/Identification/tool/file/version/5.03: File#1 hasIdentification
File1/Identification/tool/file/version/5.07: File#1 hasIdentification application/msword
[Diagram labels: previous version; University of Southampton]
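The versioning pattern on these slides (one URI per version, a "latest" URI, and a "previous version" link) could be sketched as follows. Everything here is an assumption made for illustration: the `VersionedGraph` class, its method names, and the placeholder identification `application/x-example` used for version 5.03, whose value is not shown on the slide. Only the URI layout and the 5.07 identification come from the deck.

```python
class VersionedGraph:
    """Toy store: every publish creates a new version URI linked to the previous one."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.versions = {}   # version URI -> {"data": ..., "previous": URI or None}
        self.latest = None   # what the base ("latest") URI currently resolves to

    def publish(self, version, data):
        uri = f"{self.base_uri}/version/{version}"
        self.versions[uri] = {"data": data, "previous": self.latest}
        self.latest = uri
        return uri

    def history(self):
        """Walk the previous-version links from the latest version backwards."""
        chain, uri = [], self.latest
        while uri is not None:
            chain.append(uri)
            uri = self.versions[uri]["previous"]
        return chain


graph = VersionedGraph("/File1/Identification/tool/file")
# Placeholder value: the slide does not show what version 5.03 reported.
graph.publish("5.03", ("File#1", "hasIdentification", "application/x-example"))
graph.publish("5.07", ("File#1", "hasIdentification", "application/msword"))
print(graph.latest)
print(graph.history())
```

Old versions are never overwritten, so a decision made against version 5.03 can still be traced after 5.07 becomes the latest.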

3 Months Ago

www.LDS3.org
• A technical solution to all the complexity, automatic:
• Versioning
• Linking
• Annotation
• Named-Graph Management
• Query Management

Demo

www.LDS3.org
• CRUD
• SWORDv2 (Based Upon)
• OAuth Authentication

In the paper
• Links between P2-Registry, Pronom and LDS3
• Description of the LDS3 specification
• Overview of software in the LDS3 stack (hardly any of it is new)
• How LDS3 relates to Amazon S3
• More on named-graph versioning
• More on information and non-information resources.

2 Months Ago

DEMO
• http://dev.lds3.org/admin/timemachine.php?uri=http://dev.lds3.org/doc/B1/E3/7F01/8ACE-43BA-9AA9B708B7A20263

Present Day

Presenting the REF: The Results Evaluation Framework
• 5 Tools (Droid, Fits, file, fido, Tika)
• 65 Versions (from 2008 to now)
• 1 Govdocs Corpus
• 1 Question…

How accurate are file format identification tools historically?

PDF 1.4
http://data.openplanetsfoundation.org/ref/pdf_1.4/

DOCX
http://data.openplanetsfoundation.org/ref/docx/

Back To The Future

The Future
• Get me the identification for a file as it would have been on 3rd October 2010.
GET /ref/?query=“SELECT ?identification where file = X” HTTP/1.1
Accept-Datetime: Sun, 3 Oct 2010 12:00 GMT
Accept: text/plain

application/zip
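The request on this slide can be assembled with the Python standard library. This is a hedged sketch rather than a client for the real service: the function name `build_time_travel_request` is hypothetical, the query string is the pseudo-query from the slide (not valid SPARQL), and nothing is sent over the network. `Accept-Datetime` is the request header defined by the Memento framework for asking a server for a resource as it existed at a given time.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import urlencode


def build_time_travel_request(query, when):
    """Return (path, headers) for a GET asking for data as it was at `when` (UTC)."""
    # urlencode percent-escapes the query so it is safe inside a URL.
    path = "/ref/?" + urlencode({"query": query})
    headers = {
        # HTTP-date format, e.g. "Sun, 03 Oct 2010 12:00:00 GMT"
        "Accept-Datetime": format_datetime(when, usegmt=True),
        "Accept": "text/plain",
    }
    return path, headers


# Pseudo-query copied from the slide; a real endpoint would expect valid SPARQL.
query = "SELECT ?identification where file = X"
when = datetime(2010, 10, 3, 12, 0, tzinfo=timezone.utc)

path, headers = build_time_travel_request(query, when)
print("GET", path, "HTTP/1.1")
for name, value in headers.items():
    print(f"{name}: {value}")
```

The caller states the moment of interest once, in the header, and the query itself stays the same as it would for present-day data.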
