Principles of Data Management Syllabus & Intro
- Course website: https://cis.temple.edu/~edragut/5516/Spr20/schedule.htm
- Text Book(s)
- Workload
- Intended Schedule
- Projects
- Grading
- Reading List
One way to view the scope of this course
Here Are Some DB Questions
How about This Question?
Wait. What? I am supposed to know that already?
So, What Is This Class About Then?
Having Some Fun...
What Is a DBMS?
- A Database Management System (DBMS) is a software package designed to store and manage data.
- A database is a very large, integrated collection of data.
- It models a real-world enterprise:
  - Entities (e.g., students, courses)
  - Relationships (e.g., Madonna is taking CS 5516)
Files vs. DBMS
- Application must stage large datasets between main memory and secondary storage (e.g., buffering, page-oriented access, 32-bit addressing, etc.)
- Special code for different queries
- Must protect data from inconsistency due to multiple concurrent users
- Crash recovery
- Security and access control
Why Use a DBMS?
- Data independence and efficient access.
- Reduced application development time.
- Data integrity and security.
- Uniform data administration.
- Concurrent access, recovery from crashes.
Why Study Databases?
- Shift from computation to information
  - At the "low end": scramble to webspace (a mess!)
  - At the "high end": scientific applications
- Datasets increasing in diversity and volume
  - Digital libraries, interactive video, Human Genome project, EOS project
  - ... need for DBMS exploding
- DBMS encompasses most of CS
  - OS, languages, theory, AI, multimedia, logic
A Brief DB History
- Early 1970s
  - Many database systems
  - Incompatible, exposing many implementation details
- Then Ted Codd came along
  - Relational model
  - and…
- Donald D. Chamberlin and Raymond F. Boyce
  - Structured Query Language (SQL)
  - Implementation differences became irrelevant
  - A few major DB systems dominated the market
A Brief DB History
- James ("Jim") Nicholas Gray
  - Transactions and More Transactions (ACID)
  - System R
- Michael Stonebraker
  - INGRES, Postgres
- INGRES and System R together helped to turn relational systems from a laboratory curiosity into the default choice for even the most demanding data processing applications.
Then Web 2.0 & 3.0 and Big Data Happened
- What do you think happened?
  - Semi-structured data happened.
    - A lot of it and in many forms…
Some Facts about Web x.0 and Big Data
- Twitter: 255 million monthly active users; 500 million Tweets sent per day
- Facebook: over 1 billion monthly users; 3 million messages every 20 minutes
- Instagram: 200 million monthly active users; 1.6 billion likes and 60 million photos shared every day
Database Systems Landscape Nowadays
Somebody, Please, Bring Some Order to This Madness
Somebody, Please, Bring Some Order to This Madness (Cont'd)
- NoSQL Databases
Somebody, Please, Bring Some Order to This Madness
- Different interfaces
- Different hardware support
- Different application support
- Lack of uniformity
Source: http://www.infoq.com/articles/State-of-NoSQL
And 2016…
Database Evolution Timeline
Additional Resources
- Tutorial by C. Mohan, An In-Depth Look at Modern Database Systems
- https://docs.google.com/file/d/0B7lNUaak0bK1encwYnBVUWZSWjA/edit
Relational Data
- Tables or Relations
Relational Database: Schemas
Relational Database: Query Language
- SQL: Structured Query Language
  - A declarative language designed for managing data held in a relational database management system
    - Tell what you want and from where
    - Do not tell how to get the data
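
To make "tell what you want, not how" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The students table, its columns, and the sample rows are assumptions made up for illustration; they are not from the course material.

```python
import sqlite3

# In-memory database; the table, columns, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students (sid, name, gpa) VALUES (?, ?, ?)",
    [(1, "Ann", 3.9), (2, "Bob", 3.1), (3, "Cho", 3.6)],
)

# Declarative: the query states WHAT is wanted (names with gpa above 3.5)
# and FROM WHERE (the students table). It does not say HOW to find the rows;
# the engine picks the access path (scan, index, sort strategy).
rows = conn.execute(
    "SELECT name FROM students WHERE gpa > ? ORDER BY name", (3.5,)
).fetchall()
names = [row[0] for row in rows]
print(names)
conn.close()
```

The same SELECT would keep working unchanged if an index were later added on gpa; only the engine's plan would differ.
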
Key-Value Store
- Implemented as an associative array, map, symbol table, or dictionary: an abstract data type composed of a collection of (key, value) pairs such that each possible key appears at most once in the collection.
- A simple put/get interface
- Great properties: scalability, availability, reliability
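
The put/get interface can be sketched in a few lines of Python. This is a single-process toy, and the class name KeyValueStore and the sample keys are hypothetical. The scalability and availability properties come from partitioning and replicating such a map across many machines, which this sketch does not attempt.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store exposing only put and get."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # A second put on the same key overwrites the old value,
        # so each key appears at most once in the collection.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._data.get(key)


store = KeyValueStore()
store.put("user:42", b"Madonna")
store.put("user:42", b"Madonna C.")  # overwrite, not a second entry
print(store.get("user:42"))
print(store.get("missing"))
```
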
Key-Value Store Usage Scenarios
- Increasingly popular within data centers and in P2P

  Setting      Used by      Key-value store
  Data center  amazon.com   Dynamo
  Data center  LinkedIn     Voldemort
  Data center  Facebook     Cassandra
  P2P          Vuze         Vuze DHT
  P2P          uTorrent     uTorrent DHT
Row Store and Column Store
Source: S. Harizopoulos, D. Abadi, P. Boncz, Column-Oriented Database Systems, VLDB 2009 Tutorial
- In a row store, data are stored on disk tuple by tuple.
- In a column store, data are stored on disk column by column.
- Column stores are more I/O efficient for read-only queries because they read only the attributes that a query accesses.
Row Store and Column Store
  Row store:
    (+) Easy to add/modify a record
    (-) Might read in unnecessary data
  Column store:
    (+) Only need to read in relevant data
    (-) Tuple writes require multiple accesses
- So column stores are suitable for read-mostly, read-intensive, large data repositories
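
The trade-off can be illustrated with plain Python lists standing in for disk layout. This is a rough sketch under stated assumptions: the (id, name, age) relation and its values are invented, and counting list elements is only a stand-in for real I/O cost.

```python
# Three tuples of a hypothetical relation (id, name, age).
records = [(1, "Ann", 31), (2, "Bob", 45), (3, "Cho", 27)]

# Row store: tuples are laid out one after another.
row_store = [value for record in records for value in record]

# Column store: each attribute is stored contiguously on its own.
column_store = {
    "id": [r[0] for r in records],
    "name": [r[1] for r in records],
    "age": [r[2] for r in records],
}

# Query: average age.
# Row layout: the scan passes over every value of every tuple.
values_read_row = len(row_store)
ages_from_rows = row_store[2::3]  # every third value is an age

# Column layout: only the age column is read.
ages_from_cols = list(column_store["age"])
values_read_col = len(ages_from_cols)

avg_age = sum(ages_from_cols) / len(ages_from_cols)

# Insert one new tuple.
new_record = (4, "Dee", 38)
row_store.extend(new_record)  # row store: one contiguous write
writes_col = 0
for column_name, value in zip(column_store, new_record):
    column_store[column_name].append(value)  # column store: one write per column
    writes_col += 1

print(values_read_row, values_read_col, writes_col)
```

The read counts show the (+) of the column layout for a single-attribute query, and the write count shows its (-) for tuple inserts.
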
Graph Databases
- Example graphs: biological network, ecological network, social network, chemical network, program flow, web graph
Graph Databases: Query
- Find all the restaurants my friends (in Facebook) like
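
As a rough sketch, the query above is a two-hop traversal: follow friend edges from me, then follow like edges from each friend. The adjacency dictionaries, the people, and the restaurant names below are all made up for illustration, and the sketch assumes every liked item is a restaurant. A real graph database would express this in its own query language.

```python
from typing import Dict, Set

# Hypothetical toy graph, one adjacency map per edge type.
friends: Dict[str, Set[str]] = {
    "me": {"alice", "bob"},
    "alice": {"me", "carol"},
    "bob": {"me"},
    "carol": {"alice"},
}
likes: Dict[str, Set[str]] = {
    "alice": {"Pasta Place", "Sushi Bar"},
    "bob": {"Sushi Bar"},
    "carol": {"Taco Stand"},
}


def restaurants_friends_like(person: str) -> Set[str]:
    """Follow friend edges one hop, then like edges one hop."""
    result: Set[str] = set()
    for friend in friends.get(person, set()):
        result |= likes.get(friend, set())
    return result


found = restaurants_friends_like("me")
print(sorted(found))
```

Carol's favorite is excluded because she is a friend of a friend, two friend hops away from me.
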
So, Why Study Relational DBs?
- Jack Clark, The Register, 30 August 2013: “The tech world is turning back toward SQL, bringing to a close a possibly misspent half-decade in which startups courted developers with promises of infinite scalability and the finest imitation-Google tools available, and companies found themselves exposed to unstable data and poor guarantees.”
- Google Spanner paper, October 2012: “We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”
- Sean Doherty in Wired, September 2013: “But don’t become unnecessarily distracted by the shiny, new-fangled, NoSQL red buttons just yet. Relational databases may not be hot or sexy but for your important data there is no substitute.”
And, The Key Reason of All
- Gartner estimates the RDBMS market at $26B with about 9% annual growth, whereas Market Research Media Ltd expects the NoSQL market to be at $3.5B by 2018.
  - Source: C. Mohan’s tutorial
  - Can someone check it!
Databases make these folks happy...
- End users and DBMS vendors
- DB application programmers
  - E.g., smart webmasters
- Database administrator (DBA)
  - Designs logical/physical schemas
  - Handles security and authorization
  - Data availability, crash recovery
  - Database tuning as needs evolve
- Must understand how a DBMS works!
Summary
- DBMS used to maintain, query large datasets.
- Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
- Levels of abstraction give data independence.
- A DBMS typically has a layered architecture.
- DBAs hold responsible jobs and are well-paid!
- DBMS R&D is one of the broadest, most exciting areas in CS.