A full application environment for every PR
Vishal Biyani (CTO, InfraCloud) and Jono Spiro (Ops Team, OpenGov)
Agenda
● Problem Statement: 5 minutes
● Solution Overview: 5 minutes
● Demo: 15 minutes
● Numbers: 5 minutes
● Roadmap Items: 5 minutes
Who needs an ephemeral environment?
Inner/Outer Loop of Development
- "Wish I could try out this feature right now..."
- "Wish I could test this feature with a full stack before PR..."
Image source: https://mitchdenny.com/the-inner-loop/
Evaluation criteria for an ephemeral environment
- Cost Per Environment
- User Experience
- Complexity & Team Maturity
- Quality Experience
- Product Experience
Since everything in K8s must be nautically themed...
Production is a battle group
Existing solutions are sailing vessels
Where we're going, we want speedboats
Solution Overview
It’s not just skaffolding
Solution Overview
[Architecture diagram]
- Platform Tools: AWS Certificate Manager, Route 53, S3 Buckets, Okta
- Kubernetes Cluster:
  - namespace-dev1-ephemeral (Dev 1; $ ephemeral run, $ ephemeral dev): App 1, App 2, App 3, Postgres
  - namespace-PR-ephemeral ($ ephemeral run): App 1, App 2, App 3, Postgres
- Dev Orchestrator: Skaffold, Helm, Terraform
- Cluster Manager:
  - Cleanup Controller (https://github.com/hjacobs/kube-janitor)
  - Cluster Autoscaler (https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler)
  - Spot Instance Manager (https://github.com/aws-nodetermination-handler)
- Supporting Tools: BotKube (https://www.botkube.io/), DockerHub, SumoLogic
Solution Overview: Inner Loop
[Architecture diagram, same components as the Solution Overview slide]
- namespace-dev1-ephemeral (Dev 1; $ ephemeral run, $ ephemeral dev): App 1, App 2, App 3, Postgres
- namespace-PR-ephemeral ($ skaffold run): App 1, App 2, App 3, Postgres
Solution Overview: Outer Loop
[Architecture diagram, same components as the Solution Overview slide]
- namespace-dev1-ephemeral (Dev 1; $ skaffold run, $ skaffold dev): App 1, App 2, App 3, Postgres
- namespace-PR-ephemeral ($ ephemeral run): App 1, App 2, App 3, Postgres
Solution Overview: Cost Control
[Architecture diagram, same components as the Solution Overview slide]
- TTL on ephemeral environments: no environment runs beyond a set time to live.
- Auto-scaling with min and max limits: the upper limit keeps cost from exceeding the budget, and the cluster is downsized when not in use.
- AWS Spot Instances: the Spot Instance Manager handles spot instance termination for workloads.
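The TTL rule above ("no environment runs beyond a set time to live") can be sketched in a few lines. This is only an illustration of the idea, not how kube-janitor or ephemeral.run implements it; the TTL string format ("30m", "24h", "7d") and the function names `parse_ttl` and `is_expired` are assumptions made for the example.

```python
import re
from datetime import datetime, timedelta, timezone

# Seconds per unit for TTL strings such as "30m", "24h", or "7d".
# The format itself is an assumption for this sketch.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}
_TTL_PATTERN = re.compile(r"^(\d+)([smhdw])$")


def parse_ttl(ttl: str) -> timedelta:
    """Convert a TTL string like '24h' into a timedelta."""
    match = _TTL_PATTERN.match(ttl.strip())
    if match is None:
        raise ValueError(f"unsupported TTL format: {ttl!r}")
    value, unit = match.groups()
    return timedelta(seconds=int(value) * _UNIT_SECONDS[unit])


def is_expired(created_at: datetime, ttl: str, now: datetime) -> bool:
    """Return True once the environment has lived for at least its TTL."""
    return now >= created_at + parse_ttl(ttl)


if __name__ == "__main__":
    # Hypothetical environment created at 09:00 UTC with an 8-hour TTL.
    created = datetime(2021, 5, 4, 9, 0, tzinfo=timezone.utc)
    print(is_expired(created, "8h", datetime(2021, 5, 4, 12, 0, tzinfo=timezone.utc)))
    print(is_expired(created, "8h", datetime(2021, 5, 4, 17, 0, tzinfo=timezone.utc)))
```

A cleanup controller would run a check like this periodically and delete any namespace for which it returns true.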
Solution Overview: Monitoring
[Architecture diagram, same components as the Solution Overview slide]
- Integrate external monitoring / log aggregation tools: bring your existing or third-party monitoring or log aggregation tools.
- BotKube (https://www.botkube.io/), "Chat with your cluster": monitor activities like new ephemeral environment creation / deletion from Slack / Microsoft Teams / Mattermost.
Repo Structure
front-end Service Repo (github.com/OpenGov/front-end)
- On code change: build Docker image
- When tagged for deploy: dispatch event to the ephemeral.run repo
ephemeral.run Repo (github.com/OpenGov/ephemeral.run), on demand, for every deploy request
- Update values.yaml with image
- Deploy to cluster
- Communicate details back to the dispatching repo via GitHub bot
Result: namespace-dev1-ephemeral, an ephemeral, autoscaled, TTLed instance (front end, back end, DB)
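As a rough sketch of the "dispatch event" step, the snippet below builds the request a service repo could send to the ephemeral.run repo. The slides do not say which mechanism is used; this assumes GitHub's repository dispatch REST endpoint (`POST /repos/{owner}/{repo}/dispatches`). The event name, payload fields, and image tag are hypothetical, and nothing is sent over the network.

```python
import json

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, event_type, image, source_repo, pr_number):
    """Return the (url, json_body) pair for a repository dispatch call.

    The client_payload fields are illustrative assumptions, not the
    project's actual schema.
    """
    url = f"{GITHUB_API}/repos/{owner}/{repo}/dispatches"
    body = {
        "event_type": event_type,
        "client_payload": {
            "image": image,              # Docker image built on code change
            "source_repo": source_repo,  # repo to report deploy details back to
            "pr_number": pr_number,      # PR that asked for the environment
        },
    }
    return url, json.dumps(body, sort_keys=True)


if __name__ == "__main__":
    url, body = build_dispatch_request(
        owner="OpenGov",
        repo="ephemeral.run",
        event_type="deploy-ephemeral",   # hypothetical event name
        image="front-end:abc1234",       # hypothetical image tag
        source_repo="OpenGov/front-end",
        pr_number=42,                    # hypothetical PR number
    )
    print(url)
    print(body)
```

Carrying the source repo and PR number in the payload is what would let the receiving workflow report deployment details back to the dispatching repo.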
Demo
Numbers
Numbers (>_<)
● In 2020, engineers started an average of four legacy Chef environments per month.
● These were limited, unreliable, took hours to start, and were not representative of production.
● Legacy environments cost $150/day/env, and thousands per month in fixed costs.
Numbers (^_^)
● 10x usage: Engineers started 50 environments with ephemeral.run in the first month.
● 10x faster: Environments start in ~15 minutes, and can be updated in five minutes.
● 10x cheaper: $15/day/env (billed by the minute) and $200/month in fixed costs.
● Priceless: New defects discovered immediately, and for the first time, pre-merge.
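The ratios behind the "10x" claims can be checked with a little arithmetic. The figures below come from the two Numbers slides; the three-hour environment lifetime is a hypothetical example chosen to show what per-minute billing means in practice.

```python
# Figures taken from the slides.
LEGACY_ENVS_PER_MONTH = 4      # average monthly legacy Chef environments in 2020
NEW_ENVS_FIRST_MONTH = 50      # ephemeral.run environments in the first month
LEGACY_COST_PER_DAY = 150.0    # dollars per day per environment
NEW_COST_PER_DAY = 15.0        # dollars per day per environment, billed by the minute

MINUTES_PER_DAY = 24 * 60


def env_cost(minutes_alive: float, daily_rate: float) -> float:
    """Cost of one environment under per-minute billing at a given daily rate."""
    return daily_rate * minutes_alive / MINUTES_PER_DAY


usage_ratio = NEW_ENVS_FIRST_MONTH / LEGACY_ENVS_PER_MONTH
cost_ratio = LEGACY_COST_PER_DAY / NEW_COST_PER_DAY

if __name__ == "__main__":
    print(f"usage ratio: {usage_ratio}x")   # 12.5x
    print(f"cost ratio: {cost_ratio}x")     # 10.0x
    # Hypothetical: an environment that lives for three hours.
    print(f"3-hour environment: ${env_cost(180, NEW_COST_PER_DAY):.2f}")
```

The usage ratio compares one month of ephemeral.run usage against a monthly average for the legacy setup, so it is best read as an order-of-magnitude figure rather than a like-for-like measurement.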
Roadmap Items
Join the Project!
- A generic, fork-friendly framework with simplified configuration DSL/templates
- A loving and proactive @runbot (like GitHub's @dependabot)
- Suspend/Resume Compute
- Dynamic TTLs on cluster resources
- Local-to-remote telepresence
- CI integration
- Smarter Pod scheduling to optimize autoscaling
- BotKube integration and ChatOps
- Centralized Control Plane with UI
- Usage reporting and analytics
- Budgeting policies
Resources
Project and Reference Implementation
- Fork us to try it, [emoji] us for updates, and open an issue or PR to join us!
- ephemeral.run (or github.com/OpenGov/ephemeral.run)
Follow our other projects on GitHub
- @infracloudio, @vishal-biyani
- @OpenGov, @jspiro
Discover what we're building with all this technology
- infracloud.io
- opengov.com
Work with us!
- https://jobs.cncf.io/employers/406951-infracloud-technologies
Q&A
Thank You!