Carnegie Mellon Univ Dept of Computer Science 15

  • Slides: 25
Download presentation
Carnegie Mellon Univ. Dept. of Computer Science 15 -415 - Database Applications C. Faloutsos

Carnegie Mellon Univ. Dept. of Computer Science 15 -415 - Database Applications C. Faloutsos Storage and file structures Carnegie Mellon 15 -415 - C. Faloutsos

General Overview - rel. model • Relational model - SQL – Formal & commercial

General Overview - rel. model • Relational model - SQL – Formal & commercial query languages • • Functional Dependencies Normalization Physical Design Indexing Carnegie Mellon 15 -415 - C. Faloutsos 2

Overview of a DBMS casual user Naïve user DML precomp. DML parser DBA DDL

Overview of a DBMS casual user Naïve user DML precomp. DML parser DBA DDL parser trans. mgr buffer mgr Data-files Carnegie Mellon catalog 15 -415 - C. Faloutsos 3

Overview - detailed • storage and file structures – Disk characteristics – buffering –

Overview - detailed • storage and file structures – Disk characteristics – buffering – Sequential files – Storage of catalog Carnegie Mellon 15 -415 - C. Faloutsos 4

Disk • Seek time • Transfer time • Rotation delay About 2 -10 msec

Disk • Seek time • Transfer time • Rotation delay About 2 -10 msec vs micro/nano seconds for main memory Carnegie Mellon R/W head platter cylinder track 15 -415 - C. Faloutsos 5

Disk Sector (= block=page) R/W head platter cylinder track Carnegie Mellon 15 -415 -

Disk Sector (= block=page) R/W head platter cylinder track Carnegie Mellon 15 -415 - C. Faloutsos 6

Speed; $$ • • • Storage hierarchy Cache Main memory - random access; volatile

Speed; $$ • • • Storage hierarchy Cache Main memory - random access; volatile Magnetic disk – r. a. , non-volatile Optical disk / juke-boxes – r. a. , non-vol. Magnetic tape / tape juke-boxes – seq. access Carnegie Mellon 15 -415 - C. Faloutsos 7

Buffering • Buffer replacement policies: – LRU – MRU (when? ? ) • DB

Buffering • Buffer replacement policies: – LRU – MRU (when? ? ) • DB requirements: – Pin a page in m. m. (example? ) – Force a page out of m. m. (example? ) Carnegie Mellon 15 -415 - C. Faloutsos 8

File organization Eg. , ‘Student’ records – how would you store them on disk?

File organization Eg. , ‘Student’ records – how would you store them on disk? Data-files Carnegie Mellon 15 -415 - C. Faloutsos catalog 9

Fixed length records • Solution #1: ‘Heap’ (= no order) • Solution #2: Sequentially

Fixed length records • Solution #1: ‘Heap’ (= no order) • Solution #2: Sequentially – But: Deletions? Insertions? Carnegie Mellon 15 -415 - C. Faloutsos 10

Fixed length records • Sequentially • Deletions? Insertions? header Carnegie Mellon 15 -415 -

Fixed length records • Sequentially • Deletions? Insertions? header Carnegie Mellon 15 -415 - C. Faloutsos 11

Fixed length records Problems? header block Carnegie Mellon 15 -415 - C. Faloutsos 12

Fixed length records Problems? header block Carnegie Mellon 15 -415 - C. Faloutsos 12

Fixed length records Problems? Pointers crossing block boundaries – slow seq. scan! header block

Fixed length records Problems? Pointers crossing block boundaries – slow seq. scan! header block Carnegie Mellon 15 -415 - C. Faloutsos 13

Variable-length records Eg. , with VARCHAR fields: Address VARCHAR(100). Solutions? block Carnegie Mellon 15

Variable-length records Eg. , with VARCHAR fields: Address VARCHAR(100). Solutions? block Carnegie Mellon 15 -415 - C. Faloutsos 14

Variable-length records • Byte-streams (slotted page structure!) • fixed length (padding, overflow) Carnegie Mellon

Variable-length records • Byte-streams (slotted page structure!) • fixed length (padding, overflow) Carnegie Mellon 15 -415 - C. Faloutsos 15

Variable-length records • Byte-streams: end-of-record symbol • Rarely used (why? ) 123; smith; main

Variable-length records • Byte-streams: end-of-record symbol • Rarely used (why? ) 123; smith; main EOR 234; jones; forbes ave EOR Carnegie Mellon 15 -415 - C. Faloutsos 16

Slotted page structure (a great idea – ‘page-aware’!) - records can move within the

Slotted page structure (a great idea – ‘page-aware’!) - records can move within the page - start of page: has pointers - External pointers: point only to ‘ptrs’ ptrs … Free space rec 2 Carnegie Mellon External ptr page Rec 1 15 -415 - C. Faloutsos 17

Variable-length records – cont’d - Fixed length representations – how? Carnegie Mellon 15 -415

Variable-length records – cont’d - Fixed length representations – how? Carnegie Mellon 15 -415 - C. Faloutsos 18

Variable-length records – cont’d - Fixed length representations – how? - Padding - Anchor/overflow

Variable-length records – cont’d - Fixed length representations – how? - Padding - Anchor/overflow Carnegie Mellon 15 -415 - C. Faloutsos 19

File organizations - Heap (no ordering; one table per file) - Sequential (as discussed)

File organizations - Heap (no ordering; one table per file) - Sequential (as discussed) - Clustering (many tables per file) Carnegie Mellon 15 -415 - C. Faloutsos 20

Data dictionary storage • Stored as tables!! • Drill: E-R diagram? – Relations, attributes,

Data dictionary storage • Stored as tables!! • Drill: E-R diagram? – Relations, attributes, domains – Each relation has name, some attributes – Each attribute has name, length and domain – Also, views, integrity constraints, indices – User info (authorizations etc) – statistics Carnegie Mellon 15 -415 - C. Faloutsos 21

A-name 1 relation ha s position N attribute domain Carnegie Mellon 15 -415 -

A-name 1 relation ha s position N attribute domain Carnegie Mellon 15 -415 - C. Faloutsos 22

Data dictionary storage Tables? Sys-cat-schema (rel-name, #-attributes) Att-schema( att-name, rel-name, domain-type, position) User-schema( u-id,

Data dictionary storage Tables? Sys-cat-schema (rel-name, #-attributes) Att-schema( att-name, rel-name, domain-type, position) User-schema( u-id, g-id, passwd) Index-schema( i-name, rel-name, att-name, index-type) View-schema(v-name, definition) Carnegie Mellon 15 -415 - C. Faloutsos 23

Overview - conclusions • storage and file structures – Disk characteristics -> blocks; slow

Overview - conclusions • storage and file structures – Disk characteristics -> blocks; slow access – buffering – File organization: ‘slotted page structure’ – Storage of data dictionary: as tables! Carnegie Mellon 15 -415 - C. Faloutsos 24

Carnegie Mellon 15 -415 - C. Faloutsos 25

Carnegie Mellon 15 -415 - C. Faloutsos 25