Open data sets from the EPO Linked data

Open data sets from the EPO: Linked data, full-text data
EU Datathon 2020
Martin Kracker, European Patent Office
20th March 2020

Agenda
§ EPO and patent information
§ Linked open EP data

A simple contract
Patents
§ Confer exclusivity to the patent applicant
§ Reveal the invention to the public

What information do patent documents contain?
§ Bibliographic data
  • Title, abstract
  • Applicant, inventor, legal representative
  • Dates
  • Technical classifications
  • Links to other patents (forming “families”), like
    − Earlier filings (priorities, ...)
    − Citations
§ Text and images
  • Detailed description of invention
  • Claims, drawings

EPO’s Patent Information dissemination
Human access
  • EP full-text search
  • EP Bulletin search
  • European Publication Server
  • Global Patent Index
  • PATSTAT Online
  • European Patent Register
  • Espacenet
  • Global Dossier
  • Common Citation Document
Computer access
§ Web services
  • Open Patent Services
  • European Publication Server
§ Data products
  • EP (Linked Data, XML, PDF/A, EBD)
  • worldwide (DOCDB, INPADOC)
  • PATSTAT data

Agenda
§ EPO and Patent Information
§ Linked open EP data

Key facts of “Linked open EP data”
§ Data product containing EP bibliographic data and CPC scheme
§ Format: Linked data (aka Semantic Web) (RDF)
§ Open license, free of charge, updated weekly
§ Target user group: patent non-experts, web developers, data scientists
§ Launched: April 2018, see epo.org/linked-data

Patent classification system (IPC, CPC)
  • “Bottle opener”
  • “Method of opening a bottle”
  • “Lift arrangement”
  • “Cork lifter”
  • “Cork remover”

Patent classification system (IPC, CPC)
§ They have a hierarchical structure; CPC: 250 000 symbols
  A             Human necessities
  A47           Furniture; Domestic articles and appliances, ...
  A47J          Kitchen equipment, coffee mills, spice mills, ...
  A47J 37       Baking, Roasting, Grilling, Frying
  A47J 37/06    Roasters, Grills, Sandwich grills
  A47J 37/08    Bread toasters
  A47J 37/0814  ... with automatic ejection or timing means
  A47J 37/0821  ... with mechanical clockwork timers
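The upper levels of the hierarchy can be read directly off a symbol. The following is a minimal Python sketch, assuming symbols written like the ones on the slide (for example `A47J 37/0814`); the names `CpcSymbol` and `parse_cpc` are illustrative and not part of any EPO tool. It deliberately stops at the group level: below the main group, nesting is expressed by the dots in the titles rather than by the digits, so the sketch does not try to infer that `37/0814` sits under `37/08`.

```python
import re
from typing import NamedTuple


class CpcSymbol(NamedTuple):
    section: str     # e.g. "A"
    klass: str       # e.g. "A47"
    subclass: str    # e.g. "A47J"
    main_group: str  # e.g. "A47J 37"
    group: str       # e.g. "A47J 37/0814"


# Section letter, two-digit class, subclass letter, main group, "/", subgroup.
_CPC = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a full CPC group symbol into its upper hierarchy levels."""
    m = _CPC.match(symbol.strip())
    if m is None:
        raise ValueError(f"not a full CPC group symbol: {symbol!r}")
    section, cls, sub, main, grp = m.groups()
    klass = section + cls
    subclass = klass + sub
    main_group = f"{subclass} {main}"
    return CpcSymbol(section, klass, subclass, main_group, f"{main_group}/{grp}")
```

Grouping publications by `subclass` or `main_group` is then a matter of using the corresponding field as a dictionary key.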

Linked Data: HTTP names as unique identifiers
All business objects get an HTTP name (URI) as a globally unique identifier.
Application identifier: http://data.epo.org/linked-data/id/application/EP/98925243
Publication identifier: http://data.epo.org/linked-data/publication/EP/1010425/A1
In any web browser, each HTTP name returns some useful data about that resource in a standard format. It can also return relationships to other resources using their HTTP names.
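As a small illustration, the two identifiers above can be assembled from their parts. This Python sketch assumes that the URI patterns generalise from the two examples on the slide, which the slide itself does not state; the function names are hypothetical.

```python
BASE = "http://data.epo.org/linked-data"


def application_uri(authority: str, number: str) -> str:
    """Build an application URI following the pattern of the slide's example."""
    return f"{BASE}/id/application/{authority}/{number}"


def publication_uri(authority: str, number: str, kind: str) -> str:
    """Build a publication URI following the pattern of the slide's example."""
    return f"{BASE}/publication/{authority}/{number}/{kind}"


print(application_uri("EP", "98925243"))
print(publication_uri("EP", "1010425", "A1"))
```

The two printed lines reproduce the application and publication identifiers quoted on the slide.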

Linked data: Just a (huge) collection of very simple facts
Our patent world
  nr: 1000000
  inventor name: W. Kosman
  living in: NL
  office: EPO
Linked Data model
  http://data.epo.org/.../EP/1000000/A1  is a  publication.
  http://data.epo.org/.../EP/1000000/A1  publicationNumber  "1000000".
  http://data.epo.org/.../EP/1000000/A1  publicationAuthority  "EP".
  http://data.../vc/C9B6819...6B  is a  person.
  http://data.../vc/C9B6819...6B  fn  "Kosman, W.".
  http://data.../vc/C9B6819...6B  countryCode  "NL".
  http://data.epo.org/.../EP/1000000/A1  has inventor  http://data.../vc/C9B6819...6B
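The same facts can be held as plain (subject, predicate, object) tuples. The Python sketch below is only an illustration of the triple idea, not of any EPO software: the abbreviated URIs are kept exactly as elided on the slide, and `match` and `inventor_countries` are hypothetical helper names.

```python
# URIs are abbreviated exactly as on the slide; the "..." parts are not filled in.
PUB = "http://data.epo.org/.../EP/1000000/A1"
INV = "http://data.../vc/C9B6819...6B"

triples = [
    (PUB, "is a", "publication"),
    (PUB, "publicationNumber", "1000000"),
    (PUB, "publicationAuthority", "EP"),
    (INV, "is a", "person"),
    (INV, "fn", "Kosman, W."),
    (INV, "countryCode", "NL"),
    (PUB, "has inventor", INV),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def inventor_countries(triples, publication):
    """Follow 'has inventor' links, then read each inventor's country code."""
    countries = []
    for _, _, inventor in match(triples, s=publication, p="has inventor"):
        for _, _, code in match(triples, s=inventor, p="countryCode"):
            countries.append(code)
    return countries


print(inventor_countries(triples, PUB))
```

Following a link from one resource to another, as `inventor_countries` does, is the basic step that turns the list of facts into a graph.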

Linked data can be seen as a huge network (“graph”)
[Graph diagram; legible node labels: priority, KR application, KR publication]

Example with major classes and relationships
[Diagram; legible label: Market / Value]

Linked data can be seen as a huge network (“graph”)
Technology trends

Linked data can be seen as a huge network (“graph”)
Value

Linked data can be seen as a huge network (“graph”)
Competitor watch, Inventor identification

The product page
epo.org/linked-data

API – Interactive features
Simple browser for data exploration
§ Nice presentation of resources
§ Click to change focus

API – Parameterized URIs
Linked data API
§ Retrieve one resource or list of resources
§ Filter
§ Sort
§ Define return format
§ Custom views

SPARQL queries
Powerful query language
§ for RDF graphs
§ for heterogeneous data sets
§ to explore data
§ to explore structure (meta-data)
§ federated queries
§ Solr text index
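To give a feel for what such a query looks like, here is a minimal Python sketch that builds a SPARQL query and the corresponding GET request URL. The property names `patent:publicationDate` and `patent:application` appear in the data excerpt of this presentation; the endpoint URL and the namespace behind the `patent:` prefix are not given on the slides, so the `example.org` values below are placeholders that must be replaced with the real ones from the product documentation.

```python
from urllib.parse import urlencode

# Placeholders, NOT the real values: the slides state neither the endpoint
# nor the namespace URI of the "patent:" prefix.
ENDPOINT = "https://example.org/sparql"
PATENT_NS = "http://example.org/def/patent#"


def publications_on(date_iso: str, limit: int = 10) -> str:
    """Build a query for publications (and their applications) published on a date."""
    return f"""PREFIX patent: <{PATENT_NS}>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?publication ?application
WHERE {{
  ?publication patent:publicationDate "{date_iso}"^^xsd:date ;
               patent:application ?application .
}}
LIMIT {limit}"""


def request_url(query: str) -> str:
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return ENDPOINT + "?" + urlencode({"query": query})


print(publications_on("2008-11-26"))
```

The sketch only constructs strings and does not contact any server; sending the request and parsing the result format are left out on purpose.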

Don't forget: it is pure data

<http://data.epo.org/linked-data/publication/EP/1676702/B1/->
    rdfs:label "EP 1676702 B1" ;
    patent:application <http://data.epo.org/linked-data/id/application/EP/05027699> ;
    patent:publicationAuthority <http://data.epo.org/linked-data/id/st3/EP> ;
    patent:publicationDate "2008-11-26"^^xsd:date ;

patent:publicationKind_B1
    rdfs:label "B1"@en .

<http://data.epo.org/linked-data/id/application/EP/01945281>
    patent:applicationNumber "01945281" .

patent:publicationKind_A1
    rdfs:label "A1"@en .

Download
§ about 650 million triples
§ about 60 GB (N-Triples format)
§ Updated weekly
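At roughly 60 GB for 650 million triples (on the order of 90 bytes per triple), the dump is best processed as a stream, one line at a time, rather than loaded into memory. The following Python sketch shows the idea with a deliberately simplified line pattern: it is not a complete N-Triples parser, and `iter_triples` and `predicate_counts` are hypothetical names.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Simplified N-Triples line: subject is an IRI or blank node, predicate an IRI,
# and the object is kept as raw text up to the terminating " ."
_LINE = re.compile(r"^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.*?)\s*\.\s*$")


def iter_triples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) per line, skipping blanks and comments."""
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        m = _LINE.match(line)
        if m:
            yield m.group(1), m.group(2), m.group(3)


def predicate_counts(lines: Iterable[str]) -> Counter:
    """Count how often each predicate occurs, holding only one line at a time."""
    return Counter(p for _, p, _ in iter_triples(lines))
```

Because an open file is itself an iterable of lines, `predicate_counts(open(path, encoding="utf-8"))` would walk the whole dump with constant memory, where `path` stands for wherever the downloaded file was saved.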

Benefits of linked data for data consumers
Target group: data scientists, web developers, ...
§ Very simple data format: “triples”
§ Re-use of established ontologies (classes, properties)
§ Infrastructure and standards already exist: the Web and various W3C recommendations
Less "data friction" when combining different data sets

Demo app epolod.org for Linked open EP data
§ A small web application to show the power of combining 2 linked data sets:
  − EP linked open data
  − DBpedia (= the linked data version of Wikipedia)
§ The app is for demonstration purposes only; it has not been designed as a full-blown search tool for patents

Demo app epolod.org: 1) Search publications
Step 1: Retrieve patents using some search criteria and select a single patent from the result list

Demo app epolod.org: 2) Create “related” keywords
Step 2: The system extracts from the abstract some keywords that are entries in the DBpedia encyclopaedia
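The slides do not say how the demo app finds these keywords. One simple way to approximate the step is a dictionary lookup of word sequences from the abstract against known entry labels, sketched below in Python. The tiny `LABEL_INDEX` is a hand-made stand-in for a real DBpedia label index, and `find_keywords` is a hypothetical name; none of this is taken from the app itself.

```python
import re
from typing import Dict, List

# Hand-made stand-in for a DBpedia label index: lower-cased label -> resource URI.
LABEL_INDEX: Dict[str, str] = {
    "bottle opener": "http://dbpedia.org/resource/Bottle_opener",
    "toaster": "http://dbpedia.org/resource/Toaster",
}


def find_keywords(abstract: str, index: Dict[str, str], max_words: int = 3) -> List[str]:
    """Return URIs of index entries whose label occurs in the abstract."""
    words = re.findall(r"[a-z0-9]+", abstract.lower())
    found: List[str] = []
    i = 0
    while i < len(words):
        # Prefer the longest phrase that starts at position i.
        for n in range(min(max_words, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in index:
                if index[phrase] not in found:
                    found.append(index[phrase])
                i += n
                break
        else:
            i += 1
    return found


print(find_keywords("A bottle opener combined with a toaster.", LABEL_INDEX))
```

Exact label matching of this kind misses plurals and synonyms; a production system would more likely use a dedicated entity-linking service.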

Demo app epolod.org: 3) Entry in DBpedia
Step 3: Starting from this entry, browse DBpedia by following references.
Optionally: Follow any of the references and explore DBpedia

Patent information can add value to other data
Patent data
  • Technical terms
  • Names (inventors, applicants)
  • Classifications
  • Dates and numbers
  • Citations
Other data
  • National statistics
  • Company registers
  • Geographical records
  • Academic journals
  • Dictionaries and encyclopaedias
  • Trade mark data
  • Court decisions
  • Annual reports
  • Technical magazines
  • Telephone directories
  • Classification data
  • Economic data
  • National patent data
  • Library of Congress
  • Image collections
  • Standards
  • Government subsidies

Linked Open Data cloud 2019
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Possible use cases
§ Demo web application: epolod.org
  Combined EPO LOD data set with DBpedia (LD version of Wikipedia)
§ Combining with other data sets in Linked Data format or other data formats
  • CORDIS (EU projects)
  • Scientific literature (Springer Nature publishing house)
  • Environmental data (e.g. pollution) and “green patents” (CPC class “Y02”)
§ Analysis / visualisation
  • Effects of policy decisions
  • …

Resources
General
  • Patent Information tour: http://e-courses.epo.org/wbts/pi_tour/index.html
  • Catalog of PI products: epo.org/bulk-data, see “Overview data & tools”
  • Discussion forums: epo.org/forums
  • Helpdesk: epal@epo.org
  • Questions: Martin Kracker, mkracker@epo.org
Linked open EP data
  • Documentation: epo.org/linked-data
  • Webinar recordings: epo.org/pi-videos

Thank you for your attention!
Questions: mkracker@epo.org
Martin Kracker, European Patent Office, Directorate Publication