Data will grow to 44 ZB in 2020
Today, 80% of organizations adopt cloud-first strategies
AI investment increased by 300%
Cloud Data Warehouse Big Data Modern Data Warehouse Advanced Analytics on Big Data Real-Time Advanced Analytics
Stages: INGEST, STORE, PREP & TRAIN, MODEL & SERVE
Source: Business/custom apps (structured)
Components: Azure Data Factory, Azure Blob Storage, PolyBase, Azure SQL Data Warehouse, Azure Analysis Services
Output: Analytical dashboards
Stages: INGEST, STORE, PREP & TRAIN, MODEL & SERVE
Sources: Logs, files and media (unstructured); Business/custom apps (structured)
Components: Azure Data Factory, Azure Blob Storage, Azure HDInsight, Azure Databricks, PolyBase, Azure SQL Data Warehouse, Azure Analysis Services
Diagram label: Hot Path
Output: Operational reports and analytical dashboards (Power BI)
Stages: INGEST, STORE, PREP & TRAIN, MODEL & SERVE
Sources: Logs, files and media (unstructured); Business/custom apps (structured)
Components: Azure Data Factory, Azure Blob Storage, Azure HDInsight, Azure Databricks, Azure Machine Learning, SQL Machine Learning, PolyBase, Azure SQL Data Warehouse, Azure Analysis Services
Diagram label: Hot Path
Output: Operational reports and analytical dashboards (Power BI)
Stages: INGEST, STORE, PREP & TRAIN, MODEL & SERVE
Sources: Logs, files and media (unstructured); Sensors and IoT (unstructured); Business/custom apps (structured)
Components: Azure IoT Hub, Azure HDInsight (Kafka), Azure Data Factory, Azure Blob Storage, Azure HDInsight, Azure Databricks, Azure Machine Learning, SQL Machine Learning, PolyBase, Azure SQL Data Warehouse, Azure Analysis Services
Diagram label: Hot Path
Output: Operational reports and analytical dashboards (Power BI)
Compute $$$ Storage
Control Compute Remote Storage
Control and Compute nodes (each): Cores, Memory, SSD, TempDB
Remote Storage: Data, Log, Snapshot backups
Control Compute Remote Storage Intelligent Cache
Control: Cores, SSD, TempDB
Compute nodes (each): Cores, Memory, NVMe SSD, TempDB, Cache
Remote storage: Data, Log, Snapshot backups
Memory Cache ∞ Remote Storage
Latest hardware
Hashed data maintained on the compute nodes
[Chart: Generation 1 vs. Generation 2. Labels: Max Capacity; Compressed Raw Row Storage; Compressed Columnar Raw. Axis ticks run from 0 to 1400, with an ∞ marker.]
[Chart: Generation 1 vs. Generation 2. Y-axis: $0.00 to $200.00. X-axis ticks: 100 to 6000 and 1000 to 30000.]
DWU levels: 100, 200, 300, 400, 500, 600, 1000, 1200, 1500, 2000, 3000, 6000
Find consumption patterns and leverage Azure Functions to
• Auto-scale
• Time-scale
and save money.
Coming soon: template to auto/time-scale DW (can apply to SQL DB)
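A time-scale policy of the kind mentioned above could be sketched as a small pure function that an Azure Function (or any other scheduler) calls on a timer. This is only an illustration: the schedule values, the function names `target_dwu` and `scale_statement`, and the idea of returning the T-SQL as a string are assumptions, not the template the deck announces.

```python
from datetime import datetime

# Hypothetical schedule: (start_hour, end_hour, DWU). The hours and DWU
# levels are assumptions chosen for illustration, not recommendations.
SCHEDULE = [
    (0, 6, 1000),   # nightly load window: scale up
    (6, 18, 400),   # business hours: moderate query load
    (18, 24, 100),  # evening: scale down to save money
]


def target_dwu(now: datetime) -> int:
    """Return the DWU level the schedule assigns to the given time."""
    for start, end, dwu in SCHEDULE:
        if start <= now.hour < end:
            return dwu
    raise ValueError(f"schedule does not cover hour {now.hour}")


def scale_statement(database: str, dwu: int) -> str:
    """Build the T-SQL that changes a warehouse's service objective.

    The database name is interpolated directly, which is acceptable only
    because this sketch assumes a trusted, hard-coded name.
    """
    return f"ALTER DATABASE {database} MODIFY (SERVICE_OBJECTIVE = 'DW{dwu}');"


if __name__ == "__main__":
    now = datetime(2018, 1, 15, 3, 30)
    print(scale_statement("sqltelemetry", target_dwu(now)))
```

Keeping the decision logic separate from the code that actually connects to the warehouse makes the schedule easy to test without touching a live system.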
Recommended starting point
Flexibility to select any range of DWUs
[Chart: data size bands against DWU level. Bands: 0-4 TB, 4-8 TB, 8-12 TB, 12-16 TB, 16-20 TB, 20-36 TB, 36-48 TB, 48-60 TB, 60-80 TB, 80-160 TB, >160 TB. DWU levels: 100, 200, 300, 400, 500, 600, 1000, 1200, 1500, 2000, 3000, 6000.]
ALTER DATABASE sqltelemetry MODIFY (SERVICE_OBJECTIVE = 'DW1000');
SQL DW Architecture
Application or user connection
Control ("The Brain"): Connection and tool endpoint. Coordinates storage/compute activity. Hosts the Engine and DMS.
Data Loading: ADF, SSIS, REST, OLE, ODBC, AZCopy, PowerShell
Compute ("The Brawn"): Handles query processing, ability to scale up/down. Each compute node runs DMS and SQL DB and hosts a block of distributions: Dist_DB_1 … Dist_DB_15, Dist_DB_16 … Dist_DB_30, Dist_DB_31 … Dist_DB_45, Dist_DB_46 … Dist_DB_60.
Data Movement Services (DMS): Coordinates data movement between nodes/storage.
Storage: Database data and log files are stored on WASB, separate from compute. Blob storage [WASB(S)].
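The way the 60 distributions are spread over the compute nodes in the diagram can be sketched as follows. This assumes contiguous, equally sized blocks (as the four-node picture suggests) and a node count that divides 60 evenly; the function name `distributions_per_node` and the 0-based node numbering are illustrative only.

```python
from typing import Dict, List

NUM_DISTRIBUTIONS = 60  # Dist_DB_1 through Dist_DB_60, as in the diagram


def distributions_per_node(compute_nodes: int) -> Dict[int, List[str]]:
    """Assign the 60 distributions to compute nodes in contiguous blocks.

    Assumption: blocks are contiguous and equal in size, so the node
    count must divide 60 evenly.
    """
    if compute_nodes < 1 or NUM_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("node count must be a positive divisor of 60")
    per_node = NUM_DISTRIBUTIONS // compute_nodes
    layout = {}
    for node in range(compute_nodes):
        first = node * per_node + 1   # distributions are numbered from 1
        last = first + per_node - 1
        layout[node] = [f"Dist_DB_{d}" for d in range(first, last + 1)]
    return layout


if __name__ == "__main__":
    for node, dists in distributions_per_node(4).items():
        print(node, dists[0], "...", dists[-1])
```

Under these assumptions, scaling changes only how many distributions each node serves; the total stays at 60.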
Query lifecycle (CONTROL: Engine, Shell DB, DMS; Compute: DMS, SQL DB, Dist_DB_1 … Dist_DB_60)
0. Batch (batch, sp)
1. Query submitted
2. Parsed
3. Optimized: MEMO generated, plan chosen, check some perms
4. Object locks acquired
5. System locks acquired: wait for concurrency slot
6. Executed
Timeline markers, in order: Submit Time, Start Time, End Compile Time, End Time
User query: SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales];
Control: SELECT SUM(*) FROM dbo.[FactInternetSales];
Compute: SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales];
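The split shown on this slide, where each distribution counts its own rows and the control node sums the partial results, can be simulated in a few lines. This is a toy model, not the engine's implementation: MD5 stands in for whatever hash function the service actually uses, and `distribution_for`, `compute_step` and `control_step` are hypothetical names.

```python
import hashlib
from collections import Counter
from typing import Iterable, List

NUM_DISTRIBUTIONS = 60  # the deck's architecture diagram shows 60 distributions


def distribution_for(key: str) -> int:
    """Map a distribution-key value to one of the 60 distributions.

    MD5 is only a stand-in; the real hash function is not described here.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def compute_step(keys: Iterable[str]) -> List[int]:
    """Compute side: every distribution runs COUNT_BIG(*) over its own rows."""
    counts = Counter(distribution_for(k) for k in keys)
    return [counts.get(d, 0) for d in range(NUM_DISTRIBUTIONS)]


def control_step(partial_counts: List[int]) -> int:
    """Control side: sum the partial counts into the final answer."""
    return sum(partial_counts)


if __name__ == "__main__":
    rows = [f"order-{i}" for i in range(1000)]
    partials = compute_step(rows)
    print(control_step(partials))
```

The control node only ever sees 60 small numbers, not the rows themselves, which is why a count over a large fact table moves very little data.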