Business Requirements Application SQL Server requirements Strategy Other
Business Requirements Application / SQL Server requirements Strategy Other HA Solutions Harold Velazquez Harold@Velazquez. ws Disaster Recovery High Availability and Disaster Recovery Solutions How does Licensing affect your HA Strategy?
Business Requirements What is your SLA for High Availability? • Average Speed to answer: Average time to respond to incidents. • Turn around time: Time required to complete the task. • Production uptime: Interrupted availability time/schedule. • Maintenance Schedule: Is there a planned, agreed-upon downtime for the SQL Server? • Acceptable Performance levels. SQL Servers must be operational 24/7 and no Data Loss is acceptable, under any circumstances.
Business Requirements • What is your RPO? • What is your business tolerance to data loss? • What is your RTO expectations? • How long is your business expecting to be back on business? • What are we protecting our Database systems from? • • • Hardware failures Software faults Natural Disasters Human errors Unpredictable events – such electrical outages, circuit disconnects. Scheduled maintenance
HA Technologies • Always. On Failover Cluster • Always. On Availability Group • Database Mirroring • Log Shipping (DR) • Replication (DR) • VM-based solutions
Failover Cluster Overview Instance-level protection Automatic failovers RTO is seconds to minutes RPO is No data loss Shared-Disk, can be single point of failure. • Versions / Editions supported SQL Standard supports two nodes SQL Enterprise: OS limit • Use-Cases: physical protection of the server. • • •
Always. On Availability Group • FCI single point of failure • Enterprise can host multiple databases per AG, while single database per AG. • RPO is zero and RTO is seconds* • Best use case is data storage protection by having multiple copies. • Leveraging Read. Only replicas for scalability. • AG is not supported in SQL 2012 -2014 Standard. SQL 2016 Standard supports basic AG and two nodes with no Read. Only. SQL Enterprise supports 8 secondary replicas - including 2 synchronous secondary replicas
Always. On Availability Group (Synchronous)
Always. On Availability Group (Asynchronous)
Other technologies (HA and DR) • Log Shipping • Takes log backups on schedule applying them on secondary servers • RPO – backup schedule. RTO – Depends on the process • Replication • Replicates subset of the tables across the servers • RPO – Seconds (depending on the type) • VM-Based technology • Migrating VMs across the hosts
Differences between technologies at a glance Protection Granularity Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency Seconds (Sync) Seconds-Minutes Depends N/A RTO (Recovery Time) Seconds-Minutes
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency Seconds (Sync) Seconds-Minutes Depends N/A Yes (Sync) No No RTO (Recovery Time) Seconds-Minutes Automatic Failover Yes
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency Seconds (Sync) Seconds-Minutes Depends N/A Yes (Sync) No No None Extra latency in commits RTO (Recovery Time) Seconds-Minutes Automatic Failover Performance Overhead Extra latency in Log Backup commits copies Transaction Log
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency Seconds (Sync) Seconds-Minutes Depends N/A Yes (Sync) No No None Extra latency in commits RTO (Recovery Time) Seconds-Minutes Automatic Failover Performance Overhead Version/Edition Extra latency in Log Backup commits copies Standard - Two Enterprise 2012 -2017 Deprecated, and nodes Standard 2016 -2017 still supported in 2005 -2017 Enterprise - OS (two nodes) 2012 -2016. Transaction Log 2005 -2017
Differences between technologies at a glance Protection Granularity RPO (Data Loss) Failover Cluster Availability Groups Mirroring Log Shipping Transactional Replication Operating System Database Groups* Database Publication None (Sync) Minutes Depends on latency Seconds (Sync) Seconds-Minutes Depends N/A Yes (Sync) No No None Extra latency in commits RTO (Recovery Time) Seconds-Minutes Automatic Failover Performance Overhead Version/Edition Readable Secondaries Extra latency in Log Backup commits copies Standard - Two Enterprise 2012 -2017 Deprecated, and nodes Standard 2016 -2017 still supported in 2005 -2017 Enterprise - OS (two nodes) 2012 -2016. None Yes. Eight replicas None Yes* Transaction Log 2005 -2017 Yes*
Application / SQL Server requirements • Is SQL Server in a Virtualized environment? • What SQL Server versions are supported by the application(s)? • What are the backup and retention requirements? • What are the CPU and RAM Requirements? • Are the applications Cluster-aware? • Does the Enterprise architecture encourages decoupling databases? • Will there be a need for any SQL Server Enterprise features?
Strategy (Part 1) • Design the provisioning of SQL Servers and databases by grouping them based on: • Performance Tiers (RTO, RPO) • Compliance requirements: Most SOX, some are PCI, others HIPPA, and the rest don’t. • Function: Financial, Departmental, Utility, Reporting • Domain (security) requirements: Some departments just want their own. • SQL Server Edition and version requirements for the application. • Identify and remove (or mitigate) single points of failures. • These grouping requirements improve operations, availability, security, and SQL Licensing, and overall a more stable SQL Server environment.
Strategy (Part 2) • Decoupling vs Consolidating SQL Servers • Consolidation decreases licensing costs and complexity (sometimes). • • Placing all of your eggs in one basket – they are easier to find. But… Coordinating maintenance plans require acts of congress across all tenants. New SQL Server version upgrade? All application databases have to agree on to supporting it. Failover time is increased by the number of the databases and number of open transactions. • Decouple them! But, where do those licensing costs go? • Better performance, security management, patch management, and resource isolation. • Reduces single point of failure, surface area of attack, cumulative REDOs in Always. On Synchronous replication, contention for buffer pool utilization, and spreads resource utilization across environments.
Disaster Recovery • RPO and RTO requirements drive the frequency of your backups (Mostly your RPO) • RTO is your promise to how long it will take bringing systems back to normal operations – Make sure you take into account your backup storage performance during times of stress… • What’s your acceptable data loss in the event of a disaster (RPO)? • Does your D/R have a Recovery Level Objective for your SQL Servers? • This is the granularity of SQL Server(s), Database(s), tables, or even filegroups. This can speed up fulfilling your RPO promise – and may even reduce stress during a disaster! It just requires time and effort to plan. • Perform D/R Tests quarterly • (or once a year depending on your personal risk tolerance level) • Are your physical servers/VMs provisioned with the same RAM/Storage/network?
How does Licensing affect your HA Strategy? • First, the phrase “This is not legal advice” disclaimer comes to mind. Please consult a Microsoft Licensing specialist for your environment. • Physical Operating System vs Virtual Operating System. • Single Instance SQL Servers vs Multiple Instance SQL Servers. • Windows Server 2012 Datacenter edition is the right edition to use and allows running unlimited virtual instances with each license on a single server (p 4). • Windows Server 2012 Standard edition will entitle you to run one instance in the physical OSE and two instances in the virtual OSE with each license. • SQL Server Licensing: Physical vs VM Core
How does Licensing affect your HA Strategy? • Single Instance SQL Servers vs Multiple Instance SQL Servers
How does Licensing affect your HA Strategy? • License Mobility within Server Farms. • “Licenses must be assigned to a server for at least 90 days before you are allowed to reassign it to a different server” (p 23) • Each Server, physical or virtual must be licensed properly regardless of its workload and whether is primary or secondary.
Licensing VMs – per Core, SQL Server 2012 Virtualization Licensing Guide
Licensing VMs – per Core, SQL Server 2012 Virtualization Licensing Guide When hyper-threading is turned on, a core license is required for each thread supporting a virtual core. In the example below, hyper-threading is enabled for the physical processor supporting a VM. Since hyperthreading creates two hardware threads for each physical core, a total of 8 core licenses would be required in this scenario. (p 5)
Terms to consider • • • • Lessons Learned Service-level agreements Recover point objectives Recovery time objectives Maximum time to recover Maximum tolerable downtime Baselining the current workload Baselining the existing v. Sphere implementation Estimating growth rates I/O requirements (I/O per sec, throughput, latency) Storage options (disk type/speed, RAID, flash cache) Software versions (v. Sphere, Windows, SQL Server) Licensing (may determine architecture)
AVAILABILITY = MTBF / (MTBF + MTTR) Availability % Downtime per year Downtime per month Downtime per week Downtime per day 90% ("one nine") 36. 5 days 72 hours 16. 8 hours 2. 4 hours 95% ("one and a half nines") 18. 25 days 36 hours 8. 4 hours 1. 2 hours 97% 10. 96 days 21. 6 hours 5. 04 hours 43. 2 minutes 98% 7. 30 days 14. 4 hours 3. 36 hours 28. 8 minutes 99% ("two nines") 3. 65 days 7. 20 hours 1. 68 hours 14. 4 minutes 99. 5% ("two and a half nines") 1. 83 days 3. 60 hours 50. 4 minutes 7. 2 minutes 99. 80% 17. 52 hours 86. 23 minutes 20. 16 minutes 2. 88 minutes 99. 9% ("three nines") 8. 76 hours 43. 8 minutes 10. 1 minutes 1. 44 minutes 99. 95% ("three and a half nines") 4. 38 hours 21. 56 minutes 5. 04 minutes 43. 2 seconds 99. 99% ("four nines") 52. 56 minutes 4. 38 minutes 1. 01 minutes 8. 64 seconds 99. 995% ("four and a half nines") 26. 28 minutes 2. 16 minutes 30. 24 seconds 4. 32 seconds 99. 999% ("five nines") 5. 26 minutes 25. 9 seconds 6. 05 seconds 864. 3 milliseconds 99. 9999% ("six nines") 31. 5 seconds 2. 59 seconds 604. 8 milliseconds 86. 4 milliseconds 99. 99999% ("seven nines") 3. 15 seconds 262. 97 milliseconds 60. 48 milliseconds 8. 64 milliseconds 99. 999999% ("eight nines") 315. 569 milliseconds 26. 297 milliseconds 6. 048 milliseconds 0. 864 milliseconds 99. 9999999% ("nines") 31. 5569 milliseconds 2. 6297 milliseconds 0. 6048 milliseconds 0. 0864 milliseconds
Related Resources and Links • The magic nines: https: //en. wikipedia. org/wiki/High_availability • Licensing Windows Server 2012 for use with virtualization technologies http: //download. microsoft. com/download/3/D/4/3 D 42 BDC 2 -6725 -4 B 29 B 75 A-A 5 B 04179958 B/Windows. Server 2012 Virtual. Tech_VLBrief. pdf • Official SQL Server on v. Sphere Best Practices Guide http: //www. vmware. com/content/dam/digitalmarketing/vmware/en/pdf/ solutions/sql-server-on-vmware-best-practices-guide. pdf • Always. On Architecture Guide: Building a High Availability and Disaster Recovery Solution by Using Failover Cluster Instances and Availability Groups http: //download. microsoft. com/download/D/2/0/D 20 E 1 C 5 F-72 EA-45059 F 26 FEF 9550 EFD 44/Building_a_HA_and_DR_Solution_using_Always. ON_SQL_F CIs_and_AGs%20 v 1. docx • Distributed Availability Groups https: //docs. microsoft. com/en-us/sql/database-engine/availabilitygroups/windows/distributed-availability-groups
- Slides: 29