Build Hybrid Data Platform with Azure SQL Database

Build Hybrid Data Platform with Azure SQL Database and SQL Server Xiaochen Wu Senior Program Manager Azure SQL DB team

Agenda • Technology choices for different scenarios • Customer use cases • Limitations, workaround and roadmap • Q&A By 2020, 90 percent of organizations will adopt hybrid infrastructure management capabilities.

Replication between SQL Server bi-direction one direction SQL SERVER Transactional Replication SQL SERVER Peer to peer replication • Not recommended for new projects Change Data Capture + SSIS • Recommended for bi-direction (active-active) replication • Change Data Capture Components by Attunity Merge replication • Not recommended for new projects Availability Group • Read-only secondary • Ideal for HADR • Start from SQL Server 2012 Replication Y SQL SERVER Bi-direction N CDC + SSIS Read-only secondary? N Tran repl Y AG

On prem to Azure migration SQL SERVER AZURE SQL DB SQL SERVER Transactional replication • Small scale • Online with minimum downtime • Replication related knowledge required Migration Y AZURE SQL DB MANAGED INSTANCE N DMS Managed Instance? Bacpac/bcp based • Small scale • Offline migration • No 1433 port required Y backup N DMS • Large scale • Offline/online migration • Network configuration (VPN/Express Route) Y Database backup & restore • Small scale • Offline migration • No 1433 port required Data Sync • Not recommended for one-time migration Large scale? Tran Repl Minimum downtime? N bacpac

Hybrid Replication Hybrid Sync SQL SERVER AZURE SQL DB SQL SERVER Y AZURE SQL DB MANAGED INSTANCE Data Sync • Managed service, easy configuration • Slower initial migration, ideal for small databases • Bi-directional sync, active-active Data Sync • Planned for post Managed Instance GA Transactional replication • On-prem to Azure one direction • Expand/reuse existing replication topology Transactional replication in Managed Instance • On-prem to Azure or Azure to on-prem one direction Transform required? N SSIS Bi-direction sync? Y N SSIS/ADF • Bi-directional sync for MI • Batch load • Transformation required N Tran Repl Azure DB (non-MI) to on-prem? Data Sync Y

Azure Replication Azure Sync AZURE SQL DB Data Sync • Bi-directional sync, active-active • Cross region, cross public and sovereign clouds • Between Azure SQL DB and MI AZURE SQL DB MANAGED INSTANCE Y AZURE SQL DB MANAGED INSTANCE Data Sync • Planned for post Managed Instance GA Transform required? N SSIS Bi-direction sync? Transactional replication in Managed Instance • Full transactional replication support • Migrating existing replication topology Y N Change Data Capture + SSIS/ADF • Bi-direction (active-active) replication • Change Data Capture Components by Attunity N Tran Repl Azure DB (non-MI) to on-prem? Data Sync Y

Failback to on-prem: exit strategy AZURE SQL DB SQL SERVER Data Sync • Keep data accessible from both Azure and on-prem • Large DB Bacpac/bcp based • Offline migration • No 1433 port required AZURE SQL DB MANAGED INSTANCE Transactional replication • Cannot initialize using backup Failback Y SQL SERVER Offline? N bacpac Managed Instance? N Data Sync Y Tran repl

Feature availability Features SQL Server (on-prem or Azure VM) SQL DB (singleton & pool) SQL DB Managed Instance Transactional replication Supported As subscriber only Supported; P 2 P is not supported Merge replication Supported/not recommended Not supported Change Tracking Supported CDC Supported Not supported SSIS Supported IR in ADFv 2 Data Sync with SQL DB only Supported Sync with SQL DB only; Full support post GA

Bi-directional replication using CDC+SSIS Clients download data from the central database to their laptops and bring it to remote sites with no internet access. The laptop need to update data when connected to internet. Merge replication can’t meet the scalability requirement of modern databases. SQL SERVER cdc. Person _Address_ CT Person_A ddress SSIS Person_Ad dress SSIS Customer uses CDC + SSIS to replace Merge replication: 1. Enable CDC on the client database 2. Use SSIS to integrate data changes into central database 3. Use SSIS to refresh the whole dataset in the client database

Synchronize your SQL data using Data Sync Uni-directional or Bi-directional sync across all SQL endpoints using SQL Data Sync. Enables hybrid SQL deployment, access your data from Azure and on premises applications. Deploy data applications globally with low connection latency

Hybrid Sync – CRM migration Customer is migrating Dynamics CRM to Dynamics 365 Customer has successfully synchronized their CRM databases between Azure SQL DB and on prem SQL Server using Data Sync. It allows them to continue to support on premises reporting and Teradata system while running Dynamics CRM in Azure. AZURE SQL DB Azure Data Sync On prem SQL SERVER Legacy Apps

Hybrid Sync – CRM migration Currently, data sync can only sync between databases with less than 500 tables. You can work around this limitation by creating multiple sync groups using different database users. Sync setup requires ALTER DATABASE permission which implies CONTROL permission over all tables so you will need to explicitly DENY the permissions on tables which you don’t want a specific user to see, instead of using GRANT.

Hybrid Sync – Data distribution Data Sync Publisher SQL SERVER My. Expenses AZURE SQL DB SQL SERVER Azure Data Sync On prem SQL SERVER My. Expenses Publisher Tran Repl SQL SERVER AZURE SQL DB Customer is using transactional replication to distribution some datasets from a central SQL Server databases to all clients. They want to modernize this distribution system and avoid the maintenance cost of transactional replication. They are taking a three steps plan: 1. Sync the database to Azure DB and then distribute data to clients (SQL Server) using data sync 2. Migrate on prem applications to Azure and retire the on prem system 3. Add new Azure SQL DB as sync member if clients want to move to Azure

Hybrid Sync – Data distribution (cont. ) Users may hit one of the following issues when trying to sync between many databases: • 30 databases per sync group limitation. • five on-premises SQL Server databases per sync group limitation. • significant performance impact to workload running in the hub database. Benefit: • Group clients based on certain attributes (location, brand…) and use different sync schema and sync frequency. • Easily add more clients when your business is growing. • The forwarders (member databases in the middle layers) can share the sync overhead from the master database.

Azure Sync – multi master GEP expanded the service to Europe in 2016. The application was suffering high latency and high connection failure rate when connection to Azure SQL database in US. GEP uses data sync to synchronize data in the configuration databases between US and Europe, so the data can be colocated with application. US My. Expenses Europe My. Expenses gh Hi AZURE SQL DB y nc e lat Data Sync Australia My. Expenses SMART by GEP is the company’s unified sourcing and procurement platform aimed to enables its users to work anywhere, anytime and on any device. AZURE SQL DB This topology also helped them easily expand the service to Australia and APEC.

Limitations, workaround and roadmap Limitations in database dimension (500 tables…) Create multiple sync groups; Contact product team Sync group with higher capacity (paid) – post GA Row filtering Sync group level row filtering in private preview Sync member level row filtering post GA Skip sync initialization In private preview Post GA Disaster recovery (doesn’t support failover group) Re-create sync group or reinitialization Support failover group and on prem AG – post GA Replicate schema change Workaround using DDL triggers Replicate schema change and flexible schema – post GA Limitations in sync group dimension (30 members…)

Resources • Azure SQL Data Sync • Data Sync best practice • Sync SQL data in large scale using Azure SQL Data Sync • Transaction replication in Managed Instance – ask PG
- Slides: 17