SQL AND NOSQL ARE TWO SIDES OF THE

  • Slides: 21
Download presentation
SQL AND NOSQL ARE TWO SIDES OF THE SAME COIN Michael Rys, Microsoft Corp.

SQL AND NOSQL ARE TWO SIDES OF THE SAME COIN Michael Rys, Microsoft Corp. @SQLServer. Mike © 2012 Microsoft Strata 2012 Conference, March 2012

AGENDA • Scaling out your business is important! • No. SQL Paradigms and No.

AGENDA • Scaling out your business is important! • No. SQL Paradigms and No. SQL Platforms • SQL learns from No. SQL (with a demo of SQL Azure Federations) • No. SQL learns from SQL • Scalable Data Processing Platform of the Future

THE WEB 2. 0 BUSINESS ARCHITECTURE Attract Individual Consumers: - Provide interesting service -

THE WEB 2. 0 BUSINESS ARCHITECTURE Attract Individual Consumers: - Provide interesting service - Provide mobility - Provide social Monetize Individual: - Upsell service - VIP - Speed - Extra Capabilities Online Business Application Monetize the Social: - Improve individual experience - Re-sell Aggregate Data (e. g. , Advertisers)

SOCIAL NETWORKING: THE BUSINESS PROBLEM • 100 s of million of users • 10

SOCIAL NETWORKING: THE BUSINESS PROBLEM • 100 s of million of users • 10 s of million of users concurrently • Terabytes to petabytes of data • Structured and unstructured • Required (eventual) data consistency across users • E. g. show your updated state in your friends’ profile pages

SOLUTION • Shard/Partition user data across hundreds to thousands of SQL Databases • Propagate

SOLUTION • Shard/Partition user data across hundreds to thousands of SQL Databases • Propagate data changes from one DB to other DBs using reliable, async Message Service • Managing routes from each DB to every other DB would be too complex • Global Transactions would hinder scale and availability • Provide a caching layer for performance • And also used for o Clean-up state (e. g. on account close) o Deploy business logic (stored procedures)

EXAMPLE ARCHITECTURE 3001 -4000 1 -1000 TX 3 Async Message 2001 -3000 TX 4

EXAMPLE ARCHITECTURE 3001 -4000 1 -1000 TX 3 Async Message 2001 -3000 TX 4 4001 -5000 Service TX 1 TX 2 Dispatcher Async My DB gets updated Message Async Message I change my status user. Id=1024 1001 -2000 TX 5 5001 -6000 Data Tier Web Tier

MANY LARGE SCALE CUSTOMERS USING SIMILAR PATTERNS • Patterns • Sharding and reliable messaging

MANY LARGE SCALE CUSTOMERS USING SIMILAR PATTERNS • Patterns • Sharding and reliable messaging • Sharding and fan/out query layer • Caching layer • Customer Examples • • • Social Networking: Facebook, My. Space, etc Online electronic stores (cannot give names ) Travel reservation systems (e. g. Choice International) MSN Casual Gaming etc.

LESSONS LEARNED FROM THESE SCENARIOS • Require high availability • Be able to scale

LESSONS LEARNED FROM THESE SCENARIOS • Require high availability • Be able to scale out: • Functional and Data Partitioning Architecture • Provide scale-out processing: o Function shipping o Fanout and Map/Reduce processing • Be able to deal with failures: o Quorum o Retries o Eventual Consistency (similar to Read-consistent Snapshot Isolation) • Be able to quickly grow and change: • Elastic scale • Flexible, open schema • Multi-version schema support Move better support for these patterns into the Data Platform!

WHAT IS NOSQL ABOUT? • No. SQL = operational and developer agility at low

WHAT IS NOSQL ABOUT? • No. SQL = operational and developer agility at low Cap. Ex and Op. Ex! • Low Cost • Free Open Source Stores • Scale Cap. Ex cost below customer growth rate • Web friendlydeveloper model and tool chain • Processing Paradigms • • High Availability(scalable Replication, Fast Failover, DR/Geo. DR, tunable latency) Scale-out (Sharding, Map-Reduce, Elasticity) Performance (tuned for specific workloads, Caching, co-located compute with partitioned state) Tunable/Eventual Consistency • Data Model Paradigms • Data first: Flexible Schema • Low-impedance mismatchbetween programming and data model: o Key-Documents and Objects (BLOBS, JSON, XML, POJO) o Key-Wide Sparse Column Sets o Graphs (e. g. , RDF) • Range from devices, over OLTP Web 2. 0 applications to Big. Data Analytics

DATA MODELS Data Model Example Stores (apologies to the ones I didnot list) Simple

DATA MODELS Data Model Example Stores (apologies to the ones I didnot list) Simple Key-Value Pairs Memcache, Redis, Dynamo, Voldermort, Level. DB, Azure Caching Wide Sparse Column Sets Hyper. Table, Big Table, Cassandra, HBASE, Hyperbase, Amazon Dynamo. DB, Windows Azure Tables, SQL Server/Azure Sparse columns BLOBs Amazon S 3, Oracle Berkeley No. SQL, Windows Azure Blob Store, SQL Server RBS/File. Table JSON Documents Mongo. DB, Couch. Base, Riak, Raven. DB Graph Neo 4 J, Graph. DB, Hypergraph. DB, Stig, Intellidimension Objects and XML Documents Versant, Oracle Berkeley No. SQL, Mark. Logic, exist. DB, EMC Hive. DB, SQL Server/Azure, Oracle, IBM DB 2 Extended Relational Oracle, EMC SQLFire, IBM DB 2, My. SQL, Postgres, SQL Server/Azure

WHAT CAN SQL LEARN FROM NOSQL? • Low Cap. Ex, Low Op. Ex •

WHAT CAN SQL LEARN FROM NOSQL? • Low Cap. Ex, Low Op. Ex • Built-in tunable High-Availability • Data scale-out (Sharding) • Processing scale-out (Map-Reduce, Fan-Out, tunable consistency) • Flexible Data Models • JSON (& XML) support • Sparse columns/Column sets • Integrate with Big. Data Analytics (e. g. , Hadoop) Many Relational Database Systems are incorporating these learning!

EXAMPLE: SQL AZURE FEDERATIONS • • • Provides Data Partitioning/Sharding at the Data Platform

EXAMPLE: SQL AZURE FEDERATIONS • • • Provides Data Partitioning/Sharding at the Data Platform Enables applications to build elastic scale-out applications Provides non-blocking SPLIT/DROP for shards (MERGE to come later) Auto-connect to right shard based on sharding keyvalue Provides SPLIT resilient query mode

SQL AZURE FEDERATION CONCEPTS § § Federation - Represents the data being sharded Federation

SQL AZURE FEDERATION CONCEPTS § § Federation - Represents the data being sharded Federation Root - Database that logically houses federations, contains federation meta data § Federation Key - Value that determines the routing of a piece of data (defines a Federation Distribution) § Atomic Unit - All rows with the same federation key value: always together! § Federation Member (aka Shard) - A physical container for a set of federated tables for a specific key range and reference tables § member’s key range § Reference Table - Non-sharded table Sharded Application Connection Gateway Federated Table - Table that contains only atomic units for the Azure DB with Federation Root Federation Directories, Federation Users, Federation Distributions, … Federation “Orders_Fed” (Federation Key: Customer. ID) Member: PK [min, 100) AU PK=5 AU PK=25 AU PK=35 Member: PK [100, 488) AU PK=105 AU PK=235 AU PK=365 Member: PK [488, max) AU PK=555 AU PK=2545 AU PK=3565 16

DEMO MAP-REDUCE SCALE-OUT OVER SQL AZURE FEDERATIONS • Sharded Games. Info table using SQL

DEMO MAP-REDUCE SCALE-OUT OVER SQL AZURE FEDERATIONS • Sharded Games. Info table using SQL Azure Federations • Use a C# library that does implement a Map/Reduce processor on top SQL Azure Federations • Mapper and Reducer are specified using SQL 17

WHAT CAN NOSQL LEARN FROM SQL? • Flexible data is good, but: • Provide

WHAT CAN NOSQL LEARN FROM SQL? • Flexible data is good, but: • Provide optional schema in data platform to help with constraints and optimizations • Procedural Scale-Out processing is good, but: • Develop a declarative language suited for and across the data models (e. g. , co. SQL) • Standardize suitable abstractions and languages • Eventual Consistency is good, but: • Provide users the choice • Simple Queries are good, but: • Provide me with secondary indexes • it will be more efficient to join between two collections of JSON documents in the query engine than in the Application layer Many No. SQL Database Systems are starting to incorporate these learnings!

THE WEB 2. 0 BUSINESS ARCHITECTURE Attract Individual Consumers: - Provide interesting service -

THE WEB 2. 0 BUSINESS ARCHITECTURE Attract Individual Consumers: - Provide interesting service - Provide mobility - Provide social Monetize Individual: - Upsell service - VIP - Speed - Extra Capabilities Online Business Applicatio n Monetize the Social: - Improve individual experience - Re-sell Aggregate Data (e. g. , Advertisers)

SCALE-OUT DATA PLATFORM ARCHITECTURE Primary Shard Readabl e Replica OLTP Workloads Readabl e Replica

SCALE-OUT DATA PLATFORM ARCHITECTURE Primary Shard Readabl e Replica OLTP Workloads Readabl e Replica Highly Available High Scale High Flexibility Readabl e Replica mostly touching 1 to low number of shards Primary Shard Copy Traditional OLAP Workloads known schema Data warehouse, “Star joins” Dynamic OLAP Workloads Readabl e Replica 3 Vs (Volume, Velocity, Variety) Exploratory Readabl e Replica SQL or No. SQL Store Query Scale-out queries, often using eventual consistent scale-out frameworks like Hadoop

BIG DATA REQUIRES AN END-TO-END APPROACH 21

BIG DATA REQUIRES AN END-TO-END APPROACH 21

CALL TO ACTION • Familiarize yourself with the No. SQL genes in the Microsoft

CALL TO ACTION • Familiarize yourself with the No. SQL genes in the Microsoft Online Platform • Free 3 -Month Trial for Windows and SQL Azure: http: //www. windowsazure. com • Engage with us throughout Strata Presentation Speaker Do We Have the Tools We Need to Navigate the Dave Campbell New World of Data? Onsite Interview * Tim O’Reilly, Dave Campbell Unleash Insights on All Data With Microsoft Big Alexander Stojanovic Data Office Hours (Q&A session) Dave Campbell Hadoop + Javascript: What We Learned Asad Khan Democratizing BI at Microsoft: 40, 000 Users and Kirkland Barrett Counting Data Marketplaces For Your Extended Enterprise Piyush Lumba Date and Time 2/29 9: 00 am PST 2/29 10: 15 am PST 2/29 11: 30 am PST 2/29 1: 30 pm PST 2/29 2: 20 pm PST 3/1 10: 40 am PST 3/1 2: 20 pm PST • Download slides with additional information and related resources: http: //. . . . 22

APPENDIX 23

APPENDIX 23

RELATED RESOURCES • Scale-Out with SQL Databases • http: //gigaom. com/cloud/facebook-shares-some-secrets-on-making-mysql-scale/ • Windows Gaming

RELATED RESOURCES • Scale-Out with SQL Databases • http: //gigaom. com/cloud/facebook-shares-some-secrets-on-making-mysql-scale/ • Windows Gaming Experience Case Study: http: //www. microsoft. com/casestudies/Case_Study_Detail. aspx? Case. Study. ID=4000008310 • Scalable SQL: http: //cacm. org/magazines/2011/6/108663 -scalable-sql • http: //www. slideshare. net/Michael. Rys/scaling-with-sql-server-and-sql-azure-federations • No. SQL and the Windows Azure Platform • Whitepaper: http: //download. microsoft. com/download/9/E/9/9 E 9 F 240 D-0 EB 6 -472 E-B 4 DE 6 D 9 FCBB 505 DD/Windows%20 Azure%20 No%20 SQL%20 White%20 Paper. pdf • SQL Federation blog: http: //blogs. msdn. com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azurefederations. aspx • Contact me • @SQLServer. Mike • http: //sqlblog. com/blogs/michael_rys/default. aspx