OPENSTACK HA @PAYPAL
OpenStack Summit, Hong Kong, 2013

ABOUT PAYPAL
PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes.
  • 137,000 users
  • $300,000 payments processed each minute
  • 193 markets / 26 currencies
  • The World's Most Widely Used Digital Wallet

AGENDA
  • Why is HA important for PayPal?
  • Our Learning
  • Our Solution
  • What is not solved?
  • Q&A

WHY HA IS IMPORTANT?
  • "No perceived downtime" for cloud users
  • Enterprise Class
  • Auto Scaling & Flex up/down can never break
  • API Integrations always succeed
  • Everyone expected to use the cloud

AVAILABILITY REQUIREMENTS
  • No SPOF "Under the Cloud"
  • Scale Across the Data Center(s)
  • Scale Across Racks & Containers
  • Respect natural availability zones within the data centers
  • No 'cloud' can impact any other 'cloud'

INFRASTRUCTURE RACK
Layer 2 versus Layer 3
[Diagram: Infrastructure / Controller Racks and Compute Racks, each with 10g Active, 10g Passive, and 1g Mgmt links; LB Active and LB Passive at the Access layer. Label: Cattle & Puppies]

INFRASTRUCTURE RACK
  • OpenStack services are all VMs on KVM
  • Every infra component resides on 2+ nodes
  • Redundant physical racks
  • Redundant power/switches in each rack
  • Layer-3 connectivity between racks (no Layer 2)
  • Enterprise Grade Physical LB (floating VIP)
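
The "2+ nodes" and "redundant physical racks" rules above can be checked mechanically against an inventory. The following is a minimal sketch, not the presenters' tooling: the inventory format, the service and rack names, and the reading of "redundant racks" as "each service spans at least two racks" are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical inventory: service -> list of (node, rack) placements.
# All names are illustrative and do not describe PayPal's actual layout.
PLACEMENTS = {
    "keystone": [("ctl-01", "rack-a"), ("ctl-02", "rack-b")],
    "glance": [("ctl-01", "rack-a"), ("ctl-03", "rack-a")],
    "heat": [("ctl-02", "rack-b")],
}


def find_spofs(placements, min_nodes=2, min_racks=2):
    """Return {service: [reasons]} for services that violate the redundancy rules."""
    problems = defaultdict(list)
    for service, hosts in placements.items():
        nodes = {node for node, _ in hosts}
        racks = {rack for _, rack in hosts}
        if len(nodes) < min_nodes:
            problems[service].append(f"runs on {len(nodes)} node(s), need {min_nodes}+")
        if len(racks) < min_racks:
            problems[service].append(f"spans {len(racks)} rack(s), need {min_racks}+")
    return dict(problems)


if __name__ == "__main__":
    # Print every service that is a single point of failure, with the reason.
    for service, reasons in sorted(find_spofs(PLACEMENTS).items()):
        print(service, "->", "; ".join(reasons))
```

In this sample inventory, "glance" is flagged because both of its nodes share one rack, and "heat" is flagged on both counts.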

COMPUTE
[Diagram: Access layer with LB Active / LB Passive pairs above compute racks 1, 2, and 3; each rack has 10g Active, 10g Passive, and 1g Mgmt links. Compute Node: 96 Hyperscale, 16 Core, 256 GB RAM, 1.1 T Disk]

COMPUTE
[Diagram: each Hyperscale node (RAID-10) connects two 10g links as bond0 (Active / Passive) to the Top Of Rack switches, plus a 1g Management link]

OPENSTACK SERVICES

OPENSTACK CONSIDERATIONS
  • LB VIP for every service (unless it can't)
  • Connect to LB VIP, not individual nodes
  • Script to close Server Connections
  • Pacemaker only works inside a single Layer-2 (not a large enterprise)
  • Auto Restart using Monit
  • MySQL
  • Swift Cluster
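
The "auto restart" idea above can be illustrated with a small watchdog loop. This is a sketch of the general pattern only (probe a port, restart after repeated failures); it is not Monit itself and not the presenters' configuration. The `Watchdog` class, the failure threshold, and the restart callback are hypothetical names introduced for the example.

```python
import socket


def tcp_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class Watchdog:
    """Restart a service after `max_failures` consecutive failed checks (hypothetical helper)."""

    def __init__(self, check, restart, max_failures=3):
        self.check = check              # callable returning True when healthy
        self.restart = restart          # callable that restarts the service
        self.max_failures = max_failures
        self.failures = 0

    def poll(self):
        """Run one check cycle; return True if a restart was triggered."""
        if self.check():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.restart()
            self.failures = 0
            return True
        return False
```

A caller would build something like `Watchdog(check=lambda: tcp_alive(host, port), restart=restart_service)` and call `poll()` on a timer, where `host`, `port`, and `restart_service` are supplied by the operator. Requiring several consecutive failures avoids restarting a service over a single dropped probe.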

CONTINUED…
  • HEAT with Corosync/Pacemaker/keepalived (for now)
  • Keystone / Nova / Glance / Swift Proxy
  • RabbitMQ Cluster
  • Cinder Volume Service

CINDER SERVICES WORKFLOW
[Diagram: User request (create volume), Cinder API, AMQP, Cinder Scheduler, Cinder Volume, Storage Backend 1, Storage Backend 2, with the interactions numbered 1 to 6]
The figure shows a typical interaction between Cinder components to serve an end-user request (creating a new volume in this example).

CINDER SERVICES WITH HA
[Diagram: User request (create volume) passes through a Load Balancer to Cinder API A / Cinder API B; an AMQP Cluster connects Cinder Scheduler A / B and Cinder Volume A / B; Storage Backend 1 and Storage Backend 2; interactions numbered 1 to 6]
How HA is implemented for Cinder components:
  • API (stateless): Load Balancer (A/A or A/P)
  • Scheduler (stateless): Pacemaker, queue itself (A/A or A/P)
  • Volume: Pacemaker, queue itself (A/A or A/P)
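
The "queue itself" option above relies on several instances consuming from one shared queue, so that a surviving instance picks up work when its peer is down. The toy model below sketches that behaviour under stated assumptions: it uses an in-memory deque rather than a real AMQP broker, round-robin delivery is a simplification, and the `Worker` and `drain` names are hypothetical. It is not how Cinder or the message broker is implemented.

```python
from collections import deque


class Worker:
    """Stand-in for one service instance (e.g. a scheduler) reading a shared queue."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.handled = []

    def handle(self, message):
        self.handled.append(message)


def drain(queue, workers):
    """Deliver each queued message to the next live worker, round-robin.

    Returns the list of (message, worker_name) deliveries. Messages stay
    queued when no worker is alive, so nothing is dropped during an outage.
    """
    deliveries = []
    turn = 0
    while queue:
        live = [w for w in workers if w.alive]
        if not live:
            break
        worker = live[turn % len(live)]
        message = queue.popleft()
        worker.handle(message)
        deliveries.append((message, worker.name))
        turn += 1
    return deliveries
```

With both workers alive, messages alternate between them (active/active). Marking one worker as not alive sends all messages to the other, and with both down the messages simply wait in the queue.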

UNRESOLVED
  • VIP-friendly Cinder Volume service
  • Seamless Upgrade Flip
  • Failed DB TX Reconciliation
  • Consistent API Response Time

cloud@paypal.com

THANK YOU
HTTP://GITHUB.COM/PAYPAL/AURORA
SCOTT CARLSON - @RELAXED137
RAJ GEDA
ZHITENG HUANG - IRC: WINSTON-D