INGEST Onpremises data Cloud data Saa S data

  • Slides: 50
Download presentation

INGEST On-premises data Cloud data Saa. S data STORE PREPARE TRANSFORM & ENRICH SERVE

INGEST On-premises data Cloud data Saa. S data STORE PREPARE TRANSFORM & ENRICH SERVE VISUALIZE

Azure SQL Data Warehouse Best in class price performance Industry-leading security Intelligent workload management

Azure SQL Data Warehouse Best in class price performance Industry-leading security Intelligent workload management Data flexibility Developer productivity Up to 94% less expensive than competitors Defense-in-depth security and 99. 9% financially backed availability SLA Prioritize resources for the most valuable workloads Ingest variety of data sources to derive the maximum benefit Use preferred tooling for SQL data warehouse development

Power BI Import Direct. Query Composite Models & Aggregation Tables Great for small data

Power BI Import Direct. Query Composite Models & Aggregation Tables Great for small data sources and personal data discovery The enterprise solution Why choose? Import and Direct. Query in a single model Fine for CSV files, spreadsheet data and summarized OLTP data Avoid data movement Delegate query work to the back-end source; take advantage of Azure SQL Data Warehouse’s advanced features Keep summarized data local; get detail data from the source

Azure SQL Data Warehouse Best in class price performance Industry-leading security Intelligent workload management

Azure SQL Data Warehouse Best in class price performance Industry-leading security Intelligent workload management Data flexibility Developer productivity Up to 94% less expensive than competitors Defense-in-depth security and 99. 9% financially backed availability SLA Prioritize resources for the most valuable workloads Ingest variety of data sources to derive the maximum benefit Use preferred tooling for SQL data warehouse development

Complete Security CATEGORY FE ATUR E Data Protection Data In Transit Yes Data encryption

Complete Security CATEGORY FE ATUR E Data Protection Data In Transit Yes Data encryption at rest (Service & User Managed Keys) Yes Data Discovery and Classification Yes Native Row Level Security* Yes Table and View Security (GRANT / DENY) Yes Column Level Security Yes Dynamic Data Masking* Yes SQL Authentication Yes Native Azure Active Directory Yes Integrated Security Yes Multi-Factor Authentication Yes Virtual Network (VNET) Yes SQL Firewall (server) Yes Integration with Express. Route Yes SQL Threat Detection Yes SQL Auditing Yes Vulnerability Assessment Yes SQL DA TA WA REHOUSE * Booth Demo’s

Scheduler Without Importance Workload Management Workload Importance CEO CEO 1 2 3 9 4

Scheduler Without Importance Workload Management Workload Importance CEO CEO 1 2 3 9 4 5 10 Running 6 7 8 10 9 11 10 12 11 12 Queued By default, workloads are run on a first-in first out basis.

CREATE WORKLOAD CLASSIFIER classifier_name WITH ( WORKLOAD_GROUP = 'name’ , MEMBERNAME = 'security_account' [

CREATE WORKLOAD CLASSIFIER classifier_name WITH ( WORKLOAD_GROUP = 'name’ , MEMBERNAME = 'security_account' [ [ , ] IMPORTANCE = { LOW | BELOW_NORMAL | NORMAL (default) | ABOVE_NORMAL | HIGH }]) Workload Management Scheduler With Importance Turned On CEO 1 2 3 12 CEO 4 5 6 7 8 9 10 11 12 Normal Low Normal High Workload Importance Running Queued

Querying JSON select json_value (jsontext, '$. name') as name, json_value (jsontext, '$. phone') as

Querying JSON select json_value (jsontext, '$. name') as name, json_value (jsontext, '$. phone') as phone, json_value (jsontext, '$. address') as address from friends

Dynamic Data Masking

Dynamic Data Masking

Application Lifecycle Management

Application Lifecycle Management

What does a Petabyte look like? Table Name Records Customer 150, 000, 000 Lineitem

What does a Petabyte look like? Table Name Records Customer 150, 000, 000 Lineitem 6, 000, 005, 796 Nation 25 Orders 1, 500, 000, 000 Part 200, 000, 000 Partsupp 800, 000, 000 Region 5 Supplier 10, 000, 000

Scaling to Petabytes Elastic Architecture Columnar Storage Columnar Ordering Table Partitioning Nonclustered Indexes Hash

Scaling to Petabytes Elastic Architecture Columnar Storage Columnar Ordering Table Partitioning Nonclustered Indexes Hash Distribution Materialized Views Resultset Cache

Scalability

Scalability

Elastic Architecture Compute Hyper-scale Platform Separation of Compute and Storage Scale Up/Down – Pause,

Elastic Architecture Compute Hyper-scale Platform Separation of Compute and Storage Scale Up/Down – Pause, Resume Multi-tier adaptive caching Hardware accelerated networking Storage

Customer. Key 120934 Customer. Key Minimum > 120934 Maximum > 567823 Scaling to Petabytes

Customer. Key 120934 Customer. Key Minimum > 120934 Maximum > 567823 Scaling to Petabytes 293402 Columnar storage Highest compression Fast query execution 569462

Product. Key Amount 2019 -02 -01 Product. Key Scaling to Petabytes Transaction. Date >

Product. Key Amount 2019 -02 -01 Product. Key Scaling to Petabytes Transaction. Date > 2019 -02 -01 Table Partitioning Range predicate elimination Efficient data management 2019 -02 -08 Amount

Elastic Scaleout compute Elastic Compute Bottomless Storage Scaling to Petabytes Hash Distribution Collocated optimizations

Elastic Scaleout compute Elastic Compute Bottomless Storage Scaling to Petabytes Hash Distribution Collocated optimizations Physical scale-out Inherits columnar ordering, partitioning, secondary indexes Elastic Scale-out columnar tables

Scaling to Petabytes Table DDL Columnstore Index Table Partitioning Table Distribution CREATE TABLE [dbo].

Scaling to Petabytes Table DDL Columnstore Index Table Partitioning Table Distribution CREATE TABLE [dbo]. [lineitem] ( [l_shipdate] date, [l_orderkey] int ) WITH ( CLUSTERED COLUMNSTORE INDEX, PARTITION ([l_shipdate] RANGE RIGHT FOR VALUES (N'1998 -10 -01', N'1998 -11 -01')), DISTRIBUTION = HASH ( [l_orderkey] )) GO

Product. Key Amount 2430 Product. Key Minimum > 120934 1 Maximum > 567823 2102

Product. Key Amount 2430 Product. Key Minimum > 120934 1 Maximum > 567823 2102 Scaling to Petabytes Columnar ordering Efficient segment elimination Improved Data Compression Amount 2102 43. 67 5423 623. 86 293402 5427 569462 7230 7320 9435

Scaling to Petabytes Table DDL Columnstore Index Order CREATE TABLE [dbo]. [lineitem] ( [l_shipdate]

Scaling to Petabytes Table DDL Columnstore Index Order CREATE TABLE [dbo]. [lineitem] ( [l_shipdate] date, [l_orderkey] int ) WITH ( CLUSTERED COLUMNSTORE INDEX ORDER (l_partkey), PARTITION ([l_shipdate] RANGE RIGHT FOR VALUES (N'1998 -10 -01', N'1998 -11 -01')), DISTRIBUTION = HASH ( [l_orderkey] )) GO

Data size on disk - 12% Scaling to Petabytes Columnar ordering Efficient segment elimination

Data size on disk - 12% Scaling to Petabytes Columnar ordering Efficient segment elimination Improved Data Compression Without Ordering With Ordering

Scaling to Petabytes Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching

Scaling to Petabytes Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching CREATE MATERIALZIED VIEW vw_Product. Sales WITH (DISTRIBUTION = HASH(Product. Key)) AS SELECT Product. Name Product. Key, SUM(Amount) AS Total. Sales FROM Fact. Sales fs INNER JOIN Dim. Product dp ON fs. prodkey = dp. prodkey GROUP BY Product. Name, Product. Key

Fact. Sales Table 10 B Records Fact. Sales Dim. Product Table 1, 000 Records

Fact. Sales Table 10 B Records Fact. Sales Dim. Product Table 1, 000 Records Product. Name Product. Key Total. Sales Product A 5453 784, 943. 00 Product B 763 48, 723. 00 … … … Scaling to Petabytes Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching SELECT Product. Name Product. Key, SUM(Amount) AS Total. Sales FROM Fact. Sales fs INNER JOIN Dim. Product dp GROUP BY Product. Name, Product. Key Fact. Inventory Table mvw_Product. Sales 1, 000 Records

SELECT c_customerkey, c_nationkey, SUM(l_quantity), SUM(l_extendedprice) FROM [dbo]. [lineitem_Month. Partition] l Scaling to Petabytes INNER

SELECT c_customerkey, c_nationkey, SUM(l_quantity), SUM(l_extendedprice) FROM [dbo]. [lineitem_Month. Partition] l Scaling to Petabytes INNER JOIN [dbo]. [orders] o on o. o_orderkey = l. l_orderkey INNER JOIN [dbo]. [customer] c on c. c_customerkey = o. o_customerkey GROUP BY c_customerkey, Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching c_nationkey Table Distributions [dbo]. [lineitem_Month. Partition] HASH(l_orderkey) [dbo]. [orders] HASH(o_orderkey) [dbo]. [customer] HASH(c_customerkey)

FROM [dbo]. [lineitem_Month. Partition] l INNER JOIN [dbo]. [orders] o on o. o_orderkey =

FROM [dbo]. [lineitem_Month. Partition] l INNER JOIN [dbo]. [orders] o on o. o_orderkey = l. l_orderkey INNER JOIN [dbo]. [customer] c on c. c_customerkey = o. o_customerkey Scaling to Petabytes Line. Item Orders Collocated Join (Distribution Aligned) Non-collocated Join (Shuffle Required) Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching Customer

Stage 1 Orders Line. Item Collocated Join (Distribution Aligned) Scaling to Petabytes Materialized Views

Stage 1 Orders Line. Item Collocated Join (Distribution Aligned) Scaling to Petabytes Materialized Views (Shuffle Required) Stage 2 #temp (Orders + Lineitem) Nation Collocated Join (Distribution Aligned) Transactional consistently to data modification Automatic Query Optimizer matching Collocated Join (Replicate Aligned) Customer

Scaling to Petabytes Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching

Scaling to Petabytes Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching CREATE MATERIALIZED VIEW mvw_Customer. Sales WITH (DISTRIBUTION = HASH(o_custkey)) AS SELECT o_custkey, l_shipdate, SUM(l_quantity) AS l_quantity, SUM(l_extendedprice) AS l_extendedprice FROM [dbo]. [lineitem_Month. Partition] l INNER JOIN [dbo]. [orders] o on o. o_orderkey = l. l_orderkey WHERE l_shipdate >= CONVERT(DATETIME, '1998 -11 -01', 103) GROUP BY o_custkey, l_shipdate

mvw_Customer. Sales Collocated Join (Distribution Aligned) Scaling to Petabytes Customer Collocated Join (Replicate Aligned)

mvw_Customer. Sales Collocated Join (Distribution Aligned) Scaling to Petabytes Customer Collocated Join (Replicate Aligned) Nation Materialized Views Transactional consistently to data modification Automatic Query Optimizer matching Legend <replicated table>

Query Execution Time 300 275 250 Materialized Views Seconds Scaling to Petabytes 200 150

Query Execution Time 300 275 250 Materialized Views Seconds Scaling to Petabytes 200 150 100 Transactional consistently to data modification Automatic Query Optimizer matching 50 5 0 No Materialized View With Materialized View

Power BI Scaling to Petabytes Tables Power BI Direct. Query Composite Models Aggregation Tables

Power BI Scaling to Petabytes Tables Power BI Direct. Query Composite Models Aggregation Tables Materialized Views

Geography Customer Sales Power BI Import Date Product Reseller Sales Employee Data Warehouse Table

Geography Customer Sales Power BI Import Date Product Reseller Sales Employee Data Warehouse Table Power BI In-Memory

Geography Customer Sales Power BI Direct Query Date Product Reseller Sales Employee Data Warehouse

Geography Customer Sales Power BI Direct Query Date Product Reseller Sales Employee Data Warehouse Table Power BI In-Memory

Geography Customer Sales Date Product Reseller Sales Employee

Geography Customer Sales Date Product Reseller Sales Employee

Geography Sales Agg Customer Sales Date Product Reseller Sales Employee

Geography Sales Agg Customer Sales Date Product Reseller Sales Employee

Customer. Key Amount 489233 Customer. Key Minimum > Maximum > 120934 567823 Scaling to

Customer. Key Amount 489233 Customer. Key Minimum > Maximum > 120934 567823 Scaling to Petabytes Alternate path to seek records Fine grained storage for fast response 2102 5423 293402 Secondary Indexes 1 569462 5427 7230 7320 9435 Amount 1 # 5000 # 10000 # 15000 # 20000 # 25000 # 30000 # 35000 # 40000 # 45000 # 50000 # 55000 # 60000 # 65000 # 70000 # 75000 # 80000 # 85000 # 90000 # 95000 #

Scaling to Petabytes Execution 2 Cache Hit ~. 2 seconds Result set Cache Automatic

Scaling to Petabytes Execution 2 Cache Hit ~. 2 seconds Result set Cache Automatic query matching Implicit creating from query activity Resilient to cluster elasticity Execution 1 Cache Miss Regular Execution

 SELECT Product. Name SUM(Amount) AS Total. Sales FROM Fact Sales GROUP BY Product.

SELECT Product. Name SUM(Amount) AS Total. Sales FROM Fact Sales GROUP BY Product. Name) q INNER JOIN. . . Materialized Views Scaling to Petabytes Subtle Differences Result set Cache Materialized Views SELECT Product. Name SUM(Amount) AS Total. Sales FROM Fact Sales GROUP BY Product. Name) q INNER JOIN. . . Resultset Cache

Scaling to Petabytes Elastic Architecture Columnar Storage Columnar Ordering Table Partitioning Nonclustered Indexes Hash

Scaling to Petabytes Elastic Architecture Columnar Storage Columnar Ordering Table Partitioning Nonclustered Indexes Hash Distribution Materialized Views Resultset Cache

https: //mybuild. microsoft. com

https: //mybuild. microsoft. com

Thank you for attending Build 2019

Thank you for attending Build 2019