Turning The Partial-closed World Assumption Upside Down Simon Razniewski, Ognjen Savkovic and Werner Nutt Free University of Bozen-Bolzano

Overview
• Partial closed world assumption
• Incompleteness as default (IAD) versus completeness as default (CAD)
• Translating CAD to IAD
• Query completeness reasoning under CAD

Data semantics
• CWA: Database is complete
• OWA: Database is potentially incomplete
• Partial-closed world assumption (PCWA): Database is complete in some parts, in others it is potentially incomplete

How can we model the partial-closed world assumption?
• Incompleteness as default (IAD): Describe complete parts
• Completeness as default (CAD): Describe incomplete parts

IAD is not well suited for DBs that are mostly complete
• IAD: Lists 9 complete parts
• CAD: Lists 3 potentially incomplete parts
Questions:
1. How can we describe databases under CAD?
2. How can we translate from CAD to IAD?
3. Does using IAD instead of CAD make a difference?

1. How can we describe databases under CAD?
Database schema:
• student(name, degree)
• lecturer(name, faculty)
• takes(name, course)
Formalisms, from less to more expressive, each with an example and the IAD formalism that inspired it:
• Full table statements
  Example: PotInc(takes)
  Inspired by: Closed predicates in description logics
• Pattern statements
  Example: PotInc(takes(_, DB))
  Inspired by: Pattern completeness statements [Razniewski et al., SIGMOD 2015]
• Local statements
  Example: PotInc(takes(x, y); student(x, CS))
  Inspired by: Local completeness statements [Levy, VLDB 1996]
• Query statements
  Example: PotInc( Q(x) :- takes(x, y), student(x, CS). )
  Inspired by: Query completeness statements [Motro, TODS 1989]

2. Translating from CAD to IAD
Database schema
• student(name, degree)
• lecturer(name, faculty)
• takes(name, course)
CAD: takes is potentially incomplete. Other tables are complete by default.
= IAD: Complete(lecturer) and Complete(student)
CAD: takes is potentially incomplete for records of CS students. Other tables and the rest of takes are complete by default.
= IAD: ?
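The first translation on this slide (full table statements only) amounts to taking the complement of the potentially incomplete tables with respect to the schema. A minimal sketch of that idea follows; the function name `cad_to_iad_full_table` and the list-of-names encoding are illustrative assumptions, not notation from the talk.

```python
def cad_to_iad_full_table(schema, pot_inc_tables):
    """Translate full-table CAD statements into full-table IAD statements.

    schema: ordered list of table names (assumed encoding).
    pot_inc_tables: set of table names declared potentially incomplete.
    Returns the tables that must be declared Complete(...) under IAD.
    """
    unknown = set(pot_inc_tables) - set(schema)
    if unknown:
        raise ValueError(f"tables not in schema: {sorted(unknown)}")
    # Under CAD, every table not listed as potentially incomplete is complete.
    return [table for table in schema if table not in pot_inc_tables]


schema = ["student", "lecturer", "takes"]
complete = cad_to_iad_full_table(schema, {"takes"})
print(" and ".join(f"Complete({t})" for t in complete))
```

The second case on the slide (takes is potentially incomplete only for CS students) is not covered by this sketch, since it needs more than a per-table complement.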

2. The cost of translation
Result: CAD settings can be translated to IAD settings:
1. For full table statements
2. For pattern statements,
   • if attribute domains are finite, or
   • using disequality in statements
3. For local and query statements,
   • using additionally negation in statements
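To see why finite attribute domains make pattern statements translatable, one can enumerate every ground tuple and keep those that no potentially incomplete pattern matches. The sketch below is a naive illustration of that observation, not the construction from the talk; the wildcard encoding `"_"`, the helper names, and the sample domains are assumptions. Its output grows with the product of the domain sizes, which hints at the cost of translation.

```python
from itertools import product

WILDCARD = "_"  # assumed encoding of the "_" placeholder in pattern statements


def matches(pattern, row):
    """True if the ground row is an instance of the pattern."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, row))


def complete_ground_patterns(domains, pot_inc_patterns):
    """List the ground tuples that are complete by default under CAD.

    domains: one finite list of values per attribute (assumed input).
    pot_inc_patterns: patterns declared potentially incomplete.
    """
    return [
        row
        for row in product(*domains)
        if not any(matches(p, row) for p in pot_inc_patterns)
    ]


# takes(name, course) with hypothetical finite domains
domains = [["ann", "bob"], ["DB", "Logics"]]
print(complete_ground_patterns(domains, [(WILDCARD, "DB")]))
```

Here the single CAD statement PotInc(takes(_, DB)) turns into one IAD completeness statement per remaining ground tuple; a less naive translation would group them back into patterns.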

3. Query completeness reasoning: IAD instead of CAD, what’s the difference?
Consider
• QLogics(n) :- student(n, c), takes(n, Logics)
  “Students that take logics”
• PotInc(takes(n, d); lecturer(n, f))
  “Takes records of lecturers”
Lecturers currently missing from the database might take Logics, so QLogics is not guaranteed to be complete.
A query is complete if its certain answers are the same as its possible answers.
Completeness reasoning has been studied extensively in the IAD setting.

3. Variants of completeness reasoning
Input:
• Query Q
• Set of potential incompleteness statements C
1. Instance versus schema reasoning
   • Instance reasoning: Q is complete w.r.t. C over database instance I iff Q(I) = Q(I’) for all C-valid extensions I’ of I
   • Schema reasoning: Q is complete w.r.t. C iff Q is complete w.r.t. C over I for all database instances I
2. Query evaluation under bag or set semantics
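The instance reasoning definition can be made concrete with a bounded brute-force check: enumerate extensions I’ of I that only add tuples permitted by the potential incompleteness statements, and compare Q(I’) with Q(I) under set semantics. The sketch below is only an illustration under strong assumptions: it handles full-table and pattern statements (not local or query statements), it only considers extensions drawn from explicitly supplied finite domains, and all names (`is_complete`, `q_logics`, the dictionary encodings) are hypothetical.

```python
from itertools import chain, combinations, product

WILDCARD = "_"  # assumed encoding; ("_", "_") plays the role of a full-table statement


def matches(pattern, row):
    return len(pattern) == len(row) and all(
        p == WILDCARD or p == v for p, v in zip(pattern, row)
    )


def allowed_additions(instance, pot_inc, domains):
    """Ground tuples absent from the instance that some PotInc pattern permits."""
    additions = []
    for table, patterns in pot_inc.items():
        for row in product(*domains[table]):
            if row not in instance[table] and any(matches(p, row) for p in patterns):
                additions.append((table, row))
    return additions


def is_complete(query, instance, pot_inc, domains):
    """Check Q(I) == Q(I') for every extension I' over the given finite domains."""
    candidates = allowed_additions(instance, pot_inc, domains)
    baseline = query(instance)
    subsets = chain.from_iterable(
        combinations(candidates, k) for k in range(1, len(candidates) + 1)
    )
    for subset in subsets:
        extended = {table: set(rows) for table, rows in instance.items()}
        for table, row in subset:
            extended[table].add(row)
        if query(extended) != baseline:
            return False  # found an extension that changes the answer
    return True


def q_logics(db):
    """QLogics(n) :- student(n, c), takes(n, Logics), under set semantics."""
    students = {name for (name, _degree) in db["student"]}
    return {name for (name, course) in db["takes"] if course == "Logics" and name in students}


instance = {"student": {("ann", "CS")}, "takes": set()}
domains = {"takes": [["ann", "bob"], ["DB", "Logics"]]}

print(is_complete(q_logics, instance, {"takes": [(WILDCARD, WILDCARD)]}, domains))
print(is_complete(q_logics, instance, {"takes": [(WILDCARD, "DB")]}, domains))
```

With the whole takes table potentially incomplete, adding takes(ann, Logics) changes the answer, so the first check fails; with only DB records potentially incomplete, no permitted addition affects QLogics. The enumeration is exponential in the number of candidate tuples, so this is a way to read the definition, not a practical procedure.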

3. How complex is query completeness reasoning in the CAD setting?
Set semantics IAD Schema Reasoning Full-table statements PTIME Pattern statements NP-complete Instance Reasoning Bag semantics CAD Schema Reasoning PTIME Instance Reasoning IAD Schema Reasoning PTIME Instance Reasoning coNP-complete CAD Schema Reasoning PTIME Instance Reasoning coNP-complete Local statements NP-complete NP-complete coNP-hard NP-complete PTIME coNP-hard Query statements coNP-hard NP-complete PTIME coNP-hard ? ?
Legend: Straightforward new results for IAD; Existing results for IAD; New tight results for CAD; New bounds for CAD

Open questions
• Which variants of reasoning are decidable, and what is their complexity?
• Can we raise the expressiveness without increasing the complexity?
• How can we reason with definite incompleteness?
  • “Some students are definitely missing” vs. “students may be missing”

PCWA CAD IAD Questions?