NoSQL DBs
S. Sioutas, CEID@Upatras
Ion Stoica, http://inst.eecs.berkeley.edu/~cs162
NoSQL Today
More recently:
§ The term has taken on different meanings
§ One common interpretation is “not only SQL”
Most modern NoSQL systems diverge from the relational model or standard RDBMS functionality:
• The data model: relations, tuples, attributes, domains, normalization vs. documents, graphs, key/values
• The query model: relational algebra, tuple calculus vs. graph traversal, text search, map/reduce
• The implementation: rigid schemas vs. flexible schemas (schema-less); ACID compliance vs. BASE (CAP: CA, AP, CP)
In that sense, NoSQL today is more commonly meant to be something like “non-relational”
4/25 Ion Stoica CS 162 ©UCB Spring 2011 Lec 24.2
NoSQL Today (a partial, unrefined list)
HBase Cassandra Hypertable Accumulo Amazon SimpleDB SciDB Stratosphere flare Cloudata BigTable QD Technology SmartFocus KDI Alterian Cloudera C-Store Vertica Qbase–MetaCarta OpenNeptune HPCC MongoDB CouchDB Clusterpoint Server Terrastore Jackrabbit OrientDB Persevere CloudKit Djondb SchemaFreeDB SDB RaptorDB ThruDB RavenDB DynamoDB Azure Table Storage Couchbase Server Riak LevelDB Chordless GenieDB Scalaris Tokyo Kyoto Cabinet Tyrant Scalien Berkeley DB Voldemort Dynomite KAI MemcacheDB Faircom C-Tree HamsterDB STSdb Tarantool/Box Maxtable Pincaster RaptorDB TIBCO Active Spaces allegro-C nessDB HyperDex Mnesia LightCloud Hibari BangDB OpenLDAP/MDB/Lightning Scality Redis KaTree TomP2P Kumofs TreapDB NMDB luxio actord Keyspace schema-free RAMCloud SubRecord Mo8onDb Dovetaildb JDBM Neo4j InfiniteGraph Sones InfoGrid HyperGraphDB DEX GraphBase Trinity AllegroGraph BrightstarDB Bigdata Meronymy OpenLink Virtuoso VertexDB FlockDB Execom IOG Java Universal Network/Graph Framework OpenRDF/Sesame Filament OWLim iGraph Jena SPARQL OrientDb ArangoDB AlchemyDB Soft NoSQL Systems Db4o Versant Objectivity Starcounter ZODB Magma NEO siaqodb Sterling Morantex EyeDB HSS Database FramerD Ninja Database Pro StupidDB KiokuDB Perl solution Durus GigaSpaces Infinispan Queplix GridGain Galaxy SpaceBase Joafip Coherence eXtremeScale MarkLogic Server EMC Documentum xDB eXist Sedna BaseX Qizx Berkeley DB XML Xindice Tamino Globals Intersystems Cache GT.M EGTM U2 OpenInsight Reality OpenQM ESENT jBASE MultiValue eXtremeDB RDM Embedded ISIS Family Prevayler Yserial VMware vFabric GemFire Btrieve KirbyBase Tokutek Recutils FileDB Armadillo illuminate Correlation Database FluidDB Fleet DB Twisted Storage Rindo Sherpa tin Dryad Disco MUMPS Adabas XAP In-Memory Grid Oracle Big Data Appliance NetworkX PicoList Hazelcast Scale MckoiDDB eXtreme Innostore FleetDB No-List KDI SkyNet JasDB Lotus/Domino Mckoi SQL Database Perst IODB
Primary NoSQL Categories
• General categories of NoSQL systems:
– Key/value store
– (Wide) column store
– Graph store
– Document store (JSON [JavaScript Object Notation] and XML documents)
• Compared to the relational model:
– Query models are not as developed.
– The distinction between abstraction and implementation is not as clear.
Key/Value Store
• “Dynamo: Amazon’s Highly Available Key-value Store,” DeCandia, G., et al., SOSP ’07, 21st ACM Symposium on Operating Systems Principles.
• The basic data model:
– Database is a collection of key/value pairs
– The key for each pair is unique
– No requirement for normalization (and consequently dependency preservation or lossless join)
• Primary operations:
– insert(key, value)
– delete(key)
– update(key, value)
– lookup(key)
• Additional operations:
– variations on the above, e.g., reverse lookup (REVERSE INDEX)
– iterators
Systems: DynamoDB, Azure Table Storage, Riak, Redis, Aerospike, FoundationDB, LevelDB, Berkeley DB, Oracle NoSQL Database, GenieDB, BangDB, Chordless, Scalaris, Tokyo Cabinet/Tyrant, Scalien, Voldemort, Dynomite, KAI, MemcacheDB, Faircom C-Tree, LSM, KitaroDB, HamsterDB, STSdb, TarantoolBox, Maxtable, Quasardb, Pincaster, RaptorDB, TIBCO Active Spaces, Allegro-C, nessDB, HyperDex, SharedHashFile, Symas LMDB, Sophia, PickleDB, Mnesia, LightCloud, Hibari, OpenLDAP, Genomu, BinaryRage, Elliptics, DBreeze, RocksDB, TreodeDB (www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
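The primary and additional operations listed on this slide can be sketched as a toy in-memory store. This is only an illustration of the interface, assuming a single process and no persistence or replication; the class name KeyValueStore and the sample keys are hypothetical and not taken from Dynamo.

```python
from typing import Any, Dict, Iterator, List, Tuple


class KeyValueStore:
    """Toy in-memory key/value store (illustrative only, not Dynamo)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def insert(self, key: str, value: Any) -> None:
        # Keys are unique, so inserting an existing key is treated as an error.
        if key in self._data:
            raise KeyError(f"duplicate key: {key!r}")
        self._data[key] = value

    def update(self, key: str, value: Any) -> None:
        if key not in self._data:
            raise KeyError(f"unknown key: {key!r}")
        self._data[key] = value

    def delete(self, key: str) -> None:
        del self._data[key]

    def lookup(self, key: str) -> Any:
        return self._data[key]

    def reverse_lookup(self, value: Any) -> List[str]:
        # Without a reverse index this is a full scan over all pairs.
        return sorted(k for k, v in self._data.items() if v == value)

    def items(self) -> Iterator[Tuple[str, Any]]:
        # Iterator over all pairs in key order.
        for key in sorted(self._data):
            yield key, self._data[key]


store = KeyValueStore()
store.insert("user:1", "alice")
store.insert("user:2", "bob")
store.update("user:2", "alice")
```

Note that reverse_lookup scans every pair; a real system would maintain a separate reverse index to avoid that cost.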
Wide Column Store
• “Bigtable: A Distributed Storage System for Structured Data,” Chang, F., et al., OSDI ’06: Seventh Symposium on Operating System Design and Implementation, 2006.
• The basic data model:
– Database is a collection of key/value pairs
– Key consists of 3 parts: a row key, a column key, and a timestamp (i.e., the version)
– Flexible schema: the set of columns is not fixed, and may differ from row to row
• One last column detail:
– Column key consists of two parts: a column family and a qualifier
Warning #1!
Systems: Accumulo, Amazon SimpleDB, BigTable, Cassandra, Cloudata, Cloudera, Druid, Flink, HBase, Hortonworks, HPCC, Hypertable, KAI, KDI, MapR, MonetDB, OpenNeptune, Qbase, Splice Machine, Sqrrl (www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
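A minimal sketch of the three-part key described above, assuming an in-memory dictionary stands in for the store. The helper names (put, get, columns_of) and the employee data are hypothetical, chosen only to show versions and per-row column sets.

```python
from collections import defaultdict

# cells[(row_key, "family:qualifier")] -> {timestamp: value}
cells = defaultdict(dict)


def put(row_key, family, qualifier, value, timestamp):
    """Store one version of one cell."""
    cells[(row_key, f"{family}:{qualifier}")][timestamp] = value


def get(row_key, family, qualifier, timestamp=None):
    """Return the requested version, or the newest one if none is given."""
    versions = cells.get((row_key, f"{family}:{qualifier}"), {})
    if not versions:
        return None  # the cell does not exist for this row
    if timestamp is None:
        timestamp = max(versions)
    return versions.get(timestamp)


def columns_of(row_key):
    """Columns actually present in a row; they may differ from row to row."""
    return sorted(col for (row, col) in cells if row == row_key)


put("emp1", "personal", "first_name", "Ada", 1)
put("emp1", "professional", "salary", 50000, 1)
put("emp1", "professional", "salary", 55000, 2)
put("emp2", "personal", "first_name", "Bob", 1)
put("emp2", "professional", "hourly_rate", 30, 1)
```

Here emp1 has a salary column with two versions while emp2 has an hourly_rate column instead, which is the flexible-schema point of the slide.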
Wide Column Store
[Figure: one row identified by a row key (ID). Column families: Personal data (First Name, Last Name, Date of Birth) and Professional data (Job Category, Salary, Date of Hire, Employer). The individual columns inside a family are the column qualifiers.]
Wide Column Store
[Figure: one “table” with the column families Personal data, Professional data, and Medical data, in which each row carries a different set of columns. For example, one row has ID, First Name, Last Name, Date of Birth, Job Category, Salary, Date of Hire; another has ID, First Name, Middle Name, Last Name, Job Category, Employer, Hourly Rate; others add columns such as Group, Seniority, Insurance ID, Bldg #, Office #, Emergency Contact.]
Wide Column Store
[Figure: one row (row key ID) with versions t0 and t1 of its Personal data columns (First Name, Last Name, Date of Birth) and Professional data columns (Job Category, Salary, Date of Hire, Employer).]
One “row” in a wide-column NoSQL database table = many rows in several relations/tables in a relational database
Column or Row Stores?
[Figure: storage layouts, with panels labeled “column stores” and “Hybrid”.]
Relational representation:

Movies
Id   Movie             Year
1    The Dark Knight   2008
2    King’s Speech     2010
3    The Fighter       2010
4    Black Swan        2010
5    The Prestige      2006

Actors
Id     Actor             Date of birth
1234   Christian Bale    30-1-1974
5678   Natalie Portman   9-6-1981
9012   Melissa Leo       14-9-1960
3456   Colin Firth       10-9-1960

Actor-Movie
Actor   Movie
1234    1
1234    3
1234    5
5678    4
9012    3
3456    2

Key/value representation:

Key               Value
The Dark Knight   Actors: Christian Bale; Year: 2008
King’s Speech     Actors: Colin Firth; Year: 2010
The Fighter       Actors: Melissa Leo, Christian Bale; Year: 2010
Black Swan        Actors: Natalie Portman; Year: 2010
The Prestige      Actors: Christian Bale; Year: 2006
Christian Bale    Date of birth: 30-1-1974

Challenge: Indexing on keys (strings) of variable size?
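The move from the three relational tables to the key/value view can be sketched as follows. The data is the slide's; the variable names (movies, actors, plays_in, kv) and the field names inside each value are assumptions made for illustration.

```python
# Relational tables from the slide.
movies = {
    1: ("The Dark Knight", 2008),
    2: ("King's Speech", 2010),
    3: ("The Fighter", 2010),
    4: ("Black Swan", 2010),
    5: ("The Prestige", 2006),
}
actors = {
    1234: ("Christian Bale", "30-1-1974"),
    5678: ("Natalie Portman", "9-6-1981"),
    9012: ("Melissa Leo", "14-9-1960"),
    3456: ("Colin Firth", "10-9-1960"),
}
plays_in = [(1234, 1), (1234, 3), (1234, 5), (5678, 4), (9012, 3), (3456, 2)]

# Denormalize: perform the join once, at write time, and store the result
# under a human-readable key.
kv = {}
for movie_id, (title, year) in movies.items():
    cast = sorted(actors[a][0] for a, m in plays_in if m == movie_id)
    kv[title] = {"actors": cast, "year": year}
for name, born in actors.values():
    kv[name] = {"date_of_birth": born}
```

After this step a lookup such as kv["The Fighter"] needs no join, at the price of variable-length string keys, which is the indexing challenge the slide raises.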
E.g., Cassandra Row (Facebook, Netflix, Spotify)
• the value of a row is itself a sequence of key-value pairs
• such nested key-value pairs are columns
• key = column name
• a row must contain at least 1 column
Example of Columns
Comparing Cassandra (C*) and RDBMS
• with RDBMS, a normalized data model is created without considering the exact queries
o SQL can return almost anything through joins
• with C*, the data model is designed for specific queries
o schema is adjusted as new queries are introduced
• C*: NO joins, relationships, or foreign keys
o a separate table is leveraged per query
o data required by multiple tables is denormalized across those tables
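A rough sketch of the "one table per query" idea, using plain Python dictionaries in place of Cassandra tables. The two table names, the hashtag query, and the sample tweets are hypothetical; the point is only that each write is duplicated into every table that some query needs.

```python
from collections import defaultdict

tweets_by_user = defaultdict(list)     # serves: "all tweets by this user"
tweets_by_hashtag = defaultdict(list)  # serves: "all tweets with this hashtag"


def post_tweet(email, time_posted, text):
    row = {"email": email, "time_posted": time_posted, "tweet": text}
    # Denormalization: the same data is copied into each query-specific table.
    tweets_by_user[email].append(dict(row))
    for word in text.split():
        if word.startswith("#"):
            tweets_by_hashtag[word].append(dict(row))


post_tweet("a@example.com", 1, "hello #nosql")
post_tweet("b@example.com", 2, "reading about #nosql and #cassandra")
```

Each read then touches exactly one table and needs no join; the cost is extra storage and extra work on every write.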
Cassandra Query Language - CQL
• creating tables:
    CREATE TABLE users(
        email varchar,
        bio varchar,
        birthday timestamp,
        active boolean,
        PRIMARY KEY (email));

    CREATE TABLE tweets(
        email varchar,
        time_posted timestamp,
        tweet varchar,
        PRIMARY KEY (email, time_posted));
Cassandra Query Language - CQL
• inserting data
    INSERT INTO users (email, bio, birthday, active)
    VALUES ('john.doe@bti360.com', 'BT360 Teammate', 516513600000, true);
○ ** timestamp fields are specified in milliseconds since epoch
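To make the "milliseconds since epoch" note concrete, here is a small conversion sketch using only the Python standard library. The helper names are hypothetical; the value 516513600000 is the birthday from the INSERT statement on this slide.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Whole milliseconds between the Unix epoch and a timezone-aware datetime."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis: int) -> datetime:
    """Inverse conversion, returning a UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


birthday = from_epoch_millis(516513600000)
```

Under these definitions the slide's value decodes to 15 May 1986, 04:00 UTC, and converting that datetime back yields the same integer.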
Cassandra Query Language - CQL
• querying tables
• A SELECT expression reads one or more records from a Cassandra column family and returns a result set of rows
    SELECT * FROM users;
    SELECT email FROM users WHERE active = true;
• Note: filtering on a non-key column such as active generally requires a secondary index or ALLOW FILTERING
HBase Logical View (another example of Wide-Column DB)
HBase: Keys and Column Families
• Each row has a Key
• Each record is divided into Column Families
• Each column family consists of one or more Columns
[Figure labels: column family named “anchor”; column family named “Contents”; column named “apache.com”]
• Key
– Byte array
– Serves as the primary key for the table
– Indexed for fast lookup
• Column Family
– Has a name (string)
– Contains one or more related columns
• Column
– Belongs to one column family
– Included inside the row
» familyName:columnName
[Figure labels: version number for each row; value]
• Version Number
– Unique within each key
– By default the system’s timestamp
– Data type is Long
• Value (Cell)
– Byte array
Notes on Data Model
• HBase schema consists of several Tables
• Each table consists of a set of Column Families
– Columns are not part of the schema
• HBase has Dynamic Columns
– Because column names are encoded inside the cells
– Different cells can have different columns
[Figure label: the “Roles” column family has different columns in different cells]
Example
Notes on Data Model (Cont’d)
• The version number can be user-supplied
– Versions do not even have to be inserted in increasing order
– Version numbers are unique within each key
• A table can be very sparse
– Many cells are empty
• Keys are indexed as the primary key
[Figure label: has two columns, cnnsi.com and my.look.ca]
HBase Physical Model
• Each column family is stored in a separate file (the slides call these HTables; HBase’s on-disk file format is the HFile)
• Key and version numbers are replicated with each column family
• Empty cells are not stored
• HBase maintains a multi-level index on values: <key, column family, column name, timestamp>
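The physical layout described above can be approximated in a few lines: split a sparse logical table by column family, repeat the row key and timestamp in every stored entry, and skip cells that do not exist. This is a sketch under simplifying assumptions (in-memory lists instead of files); the function name to_store_files and the sample rows, loosely modeled on the anchor example used in these slides, are illustrative.

```python
from collections import defaultdict

# Logical (sparse) table: row key -> {"family:column": {timestamp: value}}
logical = {
    "com.apache.www": {"anchor:apache.com": {10: "APACHE"}},
    "com.cnn.www": {
        "anchor:cnnsi.com": {9: "CNN"},
        "anchor:my.look.ca": {8: "CNN.com"},
        "contents:html": {6: "<html>v2", 5: "<html>v1"},
    },
}


def to_store_files(table):
    """Group cells into one sorted list per column family."""
    files = defaultdict(list)
    for row_key, columns in table.items():
        for column, versions in columns.items():
            family, qualifier = column.split(":", 1)
            for timestamp, value in versions.items():
                # Row key and timestamp are repeated in every stored entry.
                files[family].append((row_key, qualifier, timestamp, value))
    for entries in files.values():
        # Sort by row key, then column, then newest version first.
        entries.sort(key=lambda e: (e[0], e[1], -e[2]))
    return dict(files)


store_files = to_store_files(logical)
```

Only cells that exist are written, so the row com.apache.www contributes nothing to the contents family.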
HBase Regions
• Each HTable (column family) is partitioned horizontally into regions
– Regions are the counterpart to HDFS blocks
[Figure label: each will be one region]
Three Major Components
• The HBaseMaster
– One master
• The HRegionServer
– Many region servers
• The HBase client
HBase Components
• Region
– A subset of a table’s rows, like horizontal range partitioning
– Automatically done
• RegionServer (many slaves)
– Manages data regions
– Serves data for reads and writes (using a log)
• Master
– Responsible for coordinating the slaves
– Assigns regions, detects failures
– Admin functions
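Horizontal range partitioning can be illustrated with a sorted list of region start keys and a binary search. This is a simplified sketch, not HBase's actual lookup path; the start keys, server names, and function names are hypothetical.

```python
import bisect

# Region i holds the row keys in [start_keys[i], start_keys[i + 1]).
start_keys = ["", "g", "n", "t"]
# Hypothetical assignment of regions to region servers, as a master might record it.
region_servers = ["rs1", "rs2", "rs1", "rs3"]


def region_for(row_key):
    """Index of the region whose key range contains row_key."""
    return bisect.bisect_right(start_keys, row_key) - 1


def server_for(row_key):
    """Region server currently responsible for row_key."""
    return region_servers[region_for(row_key)]
```

For example, region_for("mango") falls in the range starting at "g", so the request would go to the server assigned to that region.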
Big Picture
ZooKeeper
• HBase depends on ZooKeeper
• By default HBase manages the ZooKeeper instance
– E.g., starts and stops ZooKeeper
• HMaster and HRegionServers register themselves with ZooKeeper
Creating a Table
    // Classes from the classic HBase client API used on this slide
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    // config: an HBase Configuration object created elsewhere
    HBaseAdmin admin = new HBaseAdmin(config);
    HColumnDescriptor[] column = new HColumnDescriptor[2];
    column[0] = new HColumnDescriptor("columnFamily1:");
    column[1] = new HColumnDescriptor("columnFamily2:");
    HTableDescriptor desc = new HTableDescriptor(Bytes.toBytes("MyTable"));
    desc.addFamily(column[0]);
    desc.addFamily(column[1]);
    admin.createTable(desc);
Operations On Regions: Get()
• Given a key, return the corresponding record
• For each value, return the highest version
    Get get = new Get(Bytes.toBytes("row1"));
    Result r = htable.get(get);
    byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr")); // returns current version of value
• Can control the number of versions you want
Operations On Regions: Scan()
Get()
Select value from table where key=‘com.apache.www’ AND label=‘anchor:apache.com’

Row key             Time stamp   Column “anchor:”
“com.apache.www”    t12
                    t11
                    t10          “anchor:apache.com” = “APACHE”
“com.cnn.www”       t9           “anchor:cnnsi.com” = “CNN”
                    t8           “anchor:my.look.ca” = “CNN.com”
                    t6
                    t5
                    t3
Scan()
Select value from table where anchor=‘cnnsi.com’

Row key             Time stamp   Column “anchor:”
“com.apache.www”    t12
                    t11
                    t10          “anchor:apache.com” = “APACHE”
“com.cnn.www”       t9           “anchor:cnnsi.com” = “CNN”
                    t8           “anchor:my.look.ca” = “CNN.com”
                    t6
                    t5
                    t3
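The difference between the two access paths can be sketched with a dictionary standing in for the table: Get() goes straight to one row by key, while Scan() has to walk the rows and test each one. The functions below are illustrative stand-ins, not the HBase client API, and versions are omitted for brevity.

```python
# Row key -> {"family:column": value}; data follows the anchor table on the slides.
table = {
    "com.apache.www": {"anchor:apache.com": "APACHE"},
    "com.cnn.www": {"anchor:cnnsi.com": "CNN", "anchor:my.look.ca": "CNN.com"},
}


def get(row_key, column):
    """Direct lookup by row key: a single indexed access."""
    return table.get(row_key, {}).get(column)


def scan(column):
    """Walk all rows in key order and keep those that have the column."""
    for row_key in sorted(table):
        if column in table[row_key]:
            yield row_key, table[row_key][column]
```

The first query on the slides corresponds to get("com.apache.www", "anchor:apache.com"); the second, which does not know the row key, corresponds to list(scan("anchor:cnnsi.com")).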
Operations On Regions: Put()
• Insert a new record (with a new key), or
• Insert a record for an existing key
[Figure labels: implicit version number (timestamp); explicit version number]
Operations On Regions: Delete()
• Marks table cells as deleted
• Multiple levels
– Can mark an entire column family as deleted
– Can mark all column families of a given row as deleted
• All operations are logged by the RegionServers
• The log is flushed periodically
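Marking cells as deleted, rather than erasing them, can be sketched with a tombstone marker and an operation log. The Row class, its method names, and the log format are assumptions made for illustration; they are not HBase internals.

```python
TOMBSTONE = object()  # marker meaning "this cell was deleted"


class Row:
    def __init__(self):
        self.cells = {}  # "family:column" -> value or TOMBSTONE
        self.log = []    # every operation is recorded before it is applied

    def put(self, column, value):
        self.log.append(("put", column))
        self.cells[column] = value

    def delete_column(self, column):
        self.log.append(("delete", column))
        self.cells[column] = TOMBSTONE

    def delete_family(self, family):
        # Mark every column of one column family as deleted.
        for column in list(self.cells):
            if column.startswith(family + ":"):
                self.delete_column(column)

    def delete_row(self):
        # Mark all column families of the row as deleted.
        for column in list(self.cells):
            self.delete_column(column)

    def get(self, column):
        value = self.cells.get(column, TOMBSTONE)
        return None if value is TOMBSTONE else value


row = Row()
row.put("anchor:cnnsi.com", "CNN")
row.put("anchor:my.look.ca", "CNN.com")
row.put("contents:html", "<html>...")
row.delete_family("anchor")
```

Readers no longer see the anchor columns, yet the entries are still physically present until some later cleanup removes them.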
HBase: Joins
• HBase does not support joins
• Can be done in the application layer
– Using scan() and get() operations
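An application-layer join can be sketched as a scan over one table with a get against the other for each row. The tables, column names, and helper functions below are hypothetical and use dictionaries in place of HBase tables.

```python
employees = {
    "e1": {"info:name": "Ada", "info:dept": "d1"},
    "e2": {"info:name": "Bob", "info:dept": "d2"},
    "e3": {"info:name": "Cy"},  # sparse row: no department column
}
departments = {
    "d1": {"info:title": "Research"},
    "d2": {"info:title": "Sales"},
}


def scan(table):
    """Iterate over all rows of a table in row-key order."""
    for row_key in sorted(table):
        yield row_key, table[row_key]


def get(table, row_key):
    """Fetch one row by key, or None if it does not exist."""
    return table.get(row_key)


def join_employees_departments():
    """Inner join performed by the client: one get() per scanned row."""
    for _, employee in scan(employees):
        dept_key = employee.get("info:dept")
        department = get(departments, dept_key) if dept_key else None
        if department is not None:
            yield employee["info:name"], department["info:title"]
```

The cost model is visible in the code: the client issues one extra lookup per scanned row, which is why data is often denormalized instead.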
Altering a Table
• Disable the table before changing the schema
Need for High-Level Languages
• Hadoop is great for large-data processing!
– But writing Java programs for everything is verbose and slow
– Not everyone wants to (or can) write Java code
• Solution: develop higher-level data processing languages
– Hive: HQL is like SQL
– Pig: Pig Latin is a bit like Perl
Hive and Pig
• Hive: data warehousing application in Hadoop
– Query language is HQL, a variant of SQL
– Tables stored on HDFS as flat files
– Developed by Facebook, now open source
• Pig: large-scale data processing system
– Scripts are written in Pig Latin, a dataflow language
– Developed by Yahoo!, now open source
– Roughly 1/3 of all Yahoo! internal jobs
• Note on dataflow: dataflow architectures do not have a program counter. Whether and when an instruction executes is determined solely by the availability of its input arguments, so the order of instruction execution is unpredictable, i.e., behavior is nondeterministic.
• Common idea:
– Provide a higher-level language to facilitate large-data processing
– The higher-level language “compiles down” to Hadoop jobs
Graph Store
• Neo4j - “The Neo Database – A Technology Introduction,” 2006.
• The basic data model:
– Directed graphs
– Nodes & edges, with properties, i.e., “labels”
Systems: AllegroGraph, ArangoDB, Bigdata, Bitsy, BrightstarDB, DEX/Sparksee, Execom IOG, Fallen *, Filament, FlockDB, GraphBase, Graphd, Horton, HyperGraphDB, IBM System G Native Store, InfiniteGraph, InfoGrid, jCoreDB Graph, MapGraph, Meronymy, Neo4j, Orly, OpenLink Virtuoso, Oracle Spatial and Graph, Oracle NoSQL Database, OrientDB, OQGraph, Ontotext OWLIM, R2DF, ROIS, Sones GraphDB, SPARQLCity, Sqrrl Enterprise, Stardog, Teradata Aster, Titan, Trinity, TripleBit, VelocityGraph, VertexDB, WhiteDB (www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
What is Cypher?
• Declarative graph pattern matching language
– “SQL for graphs”
– Tabular results
• Cypher is evolving steadily
– Syntax changes between releases
• Supports queries
– Including aggregation, ordering and limits
– Mutating operations in product roadmap
Where and Aggregation
• Aggregation: COUNT, SUM, AVG, MAX, MIN, COLLECT
• Where clauses:
    start doctor=node:characters(name = 'Doctor')
    match (doctor)<-[:PLAYED]-(actor)-[:APPEARED_IN]->(episode)
    where actor = 'Tom Baker' and episode.title =~ /.*Dalek.*/
    return episode.title
• Ordering: order by <property> desc
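To show what such a pattern match does, here is a hand-written traversal over a tiny property graph: nodes and directed, typed edges, both kept in plain Python structures. It is a sketch of the idea, not of how Neo4j evaluates Cypher; the node ids, the property names, and the small data set are assumptions made for illustration.

```python
import re

# Nodes carry properties; edges are (source, relationship type, target).
nodes = {
    1: {"name": "Doctor"},
    2: {"name": "Tom Baker"},
    3: {"name": "David Tennant"},
    4: {"title": "Genesis of the Daleks"},
    5: {"title": "The Ark in Space"},
    6: {"title": "Daleks in Manhattan"},
}
edges = [
    (2, "PLAYED", 1),
    (3, "PLAYED", 1),
    (2, "APPEARED_IN", 4),
    (2, "APPEARED_IN", 5),
    (3, "APPEARED_IN", 6),
]


def outgoing(source, rel_type):
    """Targets reachable from source over edges of the given type."""
    return [t for s, r, t in edges if s == source and r == rel_type]


def episodes_matching(character, actor_name, title_pattern):
    """(character)<-[PLAYED]-(actor)-[APPEARED_IN]->(episode) with a WHERE filter."""
    character_ids = {n for n, p in nodes.items() if p.get("name") == character}
    titles = []
    for actor_id, props in nodes.items():
        if props.get("name") != actor_name:
            continue
        if not character_ids & set(outgoing(actor_id, "PLAYED")):
            continue
        for episode_id in outgoing(actor_id, "APPEARED_IN"):
            title = nodes[episode_id]["title"]
            if re.fullmatch(title_pattern, title):
                titles.append(title)
    return sorted(titles)


tom_baker_dalek_episodes = episodes_matching("Doctor", "Tom Baker", r".*Dalek.*")
```

An aggregation such as COUNT then reduces to len(tom_baker_dalek_episodes), and ordering to a sort on the chosen property.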
Document Store
• MongoDB - “How a Database Can Make Your Organization Faster, Better, Leaner,” February 2015.
• The basic data model:
– The general notion of a document: words, phrases, sentences, paragraphs, sections, subsections, footnotes, etc.
– Flexible schema: subcomponent structure may be nested, and vary from document to document.
– Metadata: title, author, date, embedded tags, etc.
– Key/identifier.
• One implementation detail:
– Formats vary greatly: PDF, XML, JSON, BSON, plain text, various binary, scanned image.
Systems: AmisaDB, ArangoDB, BaseX, Cassandra, Cloudant, Clusterpoint, Couchbase, CouchDB, Densodb, Djondb, EJDB, Elasticsearch, eXist, FleetDB, iBoxDB, Inquire, JasDB, MarkLogic, MongoDB, MUMPS, NeDB, NoSQL embedded db, OrientDB, RaptorDB, RavenDB, RethinkDB, SisoDB, Terrastore, ThruDB (www.nosql-database.org, www.db-engines.com, www.wikipedia.com)
RDBMS ➜ MongoDB
Database ➜ Database
Table ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedded Document
Foreign Key ➜ Reference
There are some patterns
• Embedding
• Linking
Embedding & Linking
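The two patterns can be contrasted with plain dictionaries standing in for documents and collections. The blog post and comment data are hypothetical, as are the field names; only the shape of the two designs matters.

```python
# Embedding: related data lives inside the parent document.
post_embedded = {
    "_id": 1,
    "title": "NoSQL notes",
    "comments": [
        {"author": "ann", "text": "nice"},
        {"author": "bob", "text": "+1"},
    ],
}

# Linking: the parent stores references; the children live in another collection.
posts = {1: {"_id": 1, "title": "NoSQL notes", "comment_ids": [10, 11]}}
comments = {
    10: {"_id": 10, "author": "ann", "text": "nice"},
    11: {"_id": 11, "author": "bob", "text": "+1"},
}


def comments_embedded(post):
    """One read returns the post together with its comments."""
    return [c["text"] for c in post["comments"]]


def comments_linked(post_id):
    """The application follows each reference with an extra lookup."""
    return [comments[cid]["text"] for cid in posts[post_id]["comment_ids"]]
```

Embedding favors reads that always need parent and children together; linking avoids duplicating children that are shared or that grow without bound.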
JSON
• “JavaScript Object Notation”
• Easy for humans to write/read, easy for computers to parse/generate
• Objects can be nested
• Built on
– name/value pairs
– Ordered list of values
http://json.org/
BSON
• “Binary JSON”
• Binary-encoded serialization of JSON-like docs
• Also allows “referencing”
• Embedded structure reduces need for joins
• Goals
– Lightweight
– Traversable
– Efficient (decoding and encoding)
http://bsonspec.org/
BSON Example
    {
      "_id" : "37010",
      "city" : "ADAMS",
      "pop" : 2660,
      "state" : "TN",
      "councilman" : {
        "name" : "John Smith",
        "address" : "13 Scenic Way"
      }
    }
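The nesting in this example document can be checked with Python's standard json module. Note the assumption: the standard library has no BSON encoder, so this sketch only parses the JSON text form of the document, not its binary encoding.

```python
import json

text = """
{
  "_id": "37010",
  "city": "ADAMS",
  "pop": 2660,
  "state": "TN",
  "councilman": {"name": "John Smith", "address": "13 Scenic Way"}
}
"""

doc = json.loads(text)                   # parse text into nested dicts
round_trip = json.loads(json.dumps(doc))  # serialize and parse again
councilman_name = doc["councilman"]["name"]
```

The embedded councilman object is reached with ordinary nested indexing, which is the sense in which embedding reduces the need for joins.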
CRUD Query Language
Create, Read, Update, Delete
CRUD: Using the Shell
To insert documents into a collection/make a new collection:
    db.<collection>.insert(<document>)
<=>
    INSERT INTO <table> VALUES(<attributevalues>);
CRUD: Inserting Data
Insert one document:
    db.<collection>.insert({<field>: <value>})
Inserting a document with a field name new to the collection is inherently supported by the BSON model.
To insert multiple documents, use an array.
CRUD: Querying
Done on collections.
Get all docs:
    db.<collection>.find()
    SELECT * FROM <table>;
Returns a cursor; the shell iterates over it to display the first 20 results.
Add .limit(<number>) to limit results.
Get one doc:
    db.<collection>.findOne()
CRUD: Querying
To match a specific value:
    db.<collection>.find({<field>: <value>})
“AND”:
    db.<collection>.find({<field1>: <value1>, <field2>: <value2>})
    SELECT * FROM <table> WHERE <field1> = <value1> AND <field2> = <value2>;
CRUD: Querying
OR:
    db.<collection>.find({ $or: [ {<field>: <value1>}, {<field>: <value2>} ] })
    SELECT * FROM <table> WHERE <field> = <value1> OR <field> = <value2>;
Checking for multiple values of the same field:
    db.<collection>.find({<field>: {$in: [<value>, <value>]}})
CRUD: Querying
Including/excluding document fields:
    db.<collection>.find({<field1>: <value>}, {<field2>: 0})   // exclude <field2>
    db.<collection>.find({<field>: <value>}, {<field2>: 1})    // include only <field2>
    SELECT <field2> FROM <table> WHERE <field> = <value>;
Find documents with or w/o a field:
    db.<collection>.find({<field>: {$exists: true}})
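The query forms on the last few slides (implicit AND, $or, $in, $exists, and projections) can be mimicked over a list of dictionaries. This is a deliberately small evaluator written for illustration, not MongoDB's matching logic; the function names and the sample documents are made up.

```python
_MISSING = object()


def matches(doc, query):
    """Return True if doc satisfies a small subset of MongoDB-style filters."""
    for key, cond in query.items():
        if key == "$or":
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict) and cond and all(k.startswith("$") for k in cond):
            for op, arg in cond.items():
                if op == "$in":
                    if doc.get(key, _MISSING) not in arg:
                        return False
                elif op == "$exists":
                    if (key in doc) != bool(arg):
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif doc.get(key, _MISSING) != cond:
            return False  # several plain fields in one query act as AND
    return True


def project(doc, projection):
    """Apply an include ({f: 1}) or exclude ({f: 0}) projection."""
    if not projection:
        return dict(doc)
    wanted = {k for k, v in projection.items() if v and k != "_id"}
    if wanted:  # inclusion mode; _id is kept unless explicitly set to 0
        if projection.get("_id", 1):
            wanted.add("_id")
        return {k: v for k, v in doc.items() if k in wanted}
    hidden = {k for k, v in projection.items() if not v}
    return {k: v for k, v in doc.items() if k not in hidden}


def find(collection, query=None, projection=None):
    return [project(d, projection) for d in collection if matches(d, query or {})]


# Made-up sample collection.
zips = [
    {"_id": "37010", "city": "ADAMS", "state": "TN", "pop": 2660},
    {"_id": "00001", "city": "X", "state": "KY", "pop": 100},
    {"_id": "00002", "city": "Y", "state": "TN"},
]
```

For instance, find(zips, {"state": {"$in": ["KY"]}}, {"city": 1}) returns only the _id and city of the matching document.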
CRUD: Updating
    db.<collection>.update(
      {<field1>: <value1>},            // all docs in which field = value
      {$set: {<field2>: <value2>}},    // set field to value
      {multi: true}                    // update multiple docs
    )
upsert: if true, creates a new doc when none matches the search criteria.
    UPDATE <table> SET <field2> = <value2> WHERE <field1> = <value1>;
CRUD: Updating
To remove a field:
    db.<collection>.update({<field>: <value>}, {$unset: {<field>: 1}})
Replace all field-value pairs:
    db.<collection>.update({<field>: <value>}, {<field>: <value>, <field>: <value>})
*NOTE: This overwrites ALL the contents of a document, even removing fields.
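The difference between operator updates ($set, $unset) and whole-document replacement, along with the multi and upsert options, can be sketched over a list of dictionaries. The update function below is a simplified stand-in written for illustration, with equality-only matching; it is not the MongoDB driver API, and the people collection is made up.

```python
def update(collection, query, change, multi=False, upsert=False):
    """Apply change to matching docs; return how many were touched."""
    matched = [d for d in collection if all(d.get(k) == v for k, v in query.items())]
    if not matched and upsert:
        new_doc = dict(query)  # upsert: start from the search criteria
        collection.append(new_doc)
        matched = [new_doc]
    if not multi:
        matched = matched[:1]  # default: only the first matching document
    for doc in matched:
        if any(k.startswith("$") for k in change):
            doc.update(change.get("$set", {}))
            for field in change.get("$unset", {}):
                doc.pop(field, None)
        else:
            # Replacement: every field except _id is overwritten or removed.
            doc_id = doc.get("_id")
            doc.clear()
            if doc_id is not None:
                doc["_id"] = doc_id
            doc.update(change)
    return len(matched)


people = [
    {"_id": 1, "name": "ann", "age": 30},
    {"_id": 2, "name": "bob", "age": 30},
]
touched = update(people, {"age": 30}, {"$set": {"active": True}}, multi=True)
update(people, {"name": "ann"}, {"$unset": {"age": 1}})
update(people, {"name": "bob"}, {"name": "robert"})  # replaces the whole doc
update(people, {"name": "cy"}, {"$set": {"age": 5}}, upsert=True)
```

After the replacement call, the second document keeps only its _id and the new name, which is the data-loss risk the slide's note warns about.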
CRUD: Removal
Remove all records where field = value:
    db.<collection>.remove({<field>: <value>})
    DELETE FROM <table> WHERE <field> = <value>;
As above, but only remove the first document:
    db.<collection>.remove({<field>: <value>}, true)
ACID vs. BASE
• Database systems traditionally support ACID requirements:
– Atomicity, Consistency, Isolation, Durability
• In distributed web applications the focus shifts to:
– Consistency, Availability, Partition tolerance
• CAP theorem: at most two of the above can be enforced at any given time.
– Conjecture: Eric Brewer, ACM Symposium on the Principles of Distributed Computing, 2000.
– Proved: Seth Gilbert & Nancy Lynch, ACM SIGACT News, 2002.
• Reducing consistency, at least temporarily, maintains the other two.
ACID vs. BASE
• Thus, distributed NoSQL systems are typically said to support some form of BASE:
– Basic Availability
– Soft state
– Eventual consistency*
• “We’d really like everything to be structured, consistent and harmonious, …, but what we are faced with is a little bit of punk-style anarchy. And actually, whilst it might scare our grandmothers, it’s OK...”
- Julian Browne
• https://www.youtube.com/watch?v=pOe9PJrbo0s
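Eventual consistency can be illustrated with two replicas that accept writes independently and reconcile later. This sketch assumes a last-write-wins rule keyed on a timestamp, which is only one of several possible reconciliation strategies; the Replica class, the anti_entropy function, and the shopping-cart data are hypothetical.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Last write wins: keep the value with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def anti_entropy(a, b):
    """Exchange state in both directions so the replicas converge."""
    for source, target in ((a, b), (b, a)):
        for key, (timestamp, value) in list(source.data.items()):
            target.write(key, value, timestamp)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart", ["book"], timestamp=1)
r2.write("cart", ["book", "pen"], timestamp=2)  # r1 has not seen this yet
stale = r1.read("cart")                          # soft state: r1 is behind
anti_entropy(r1, r2)
```

Both replicas stay available for reads and writes throughout; the price is that r1 briefly returns a stale cart until the exchange runs.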
END