Zonal Outage Operational Stories - Jyoti Ranjan Mahapatra & Shyam Jeedigunta, Amazon Web Services

Most datacenters have a notion of an “availability zone” as a failure domain, and correlated failures are expected within a single failure domain. Kubernetes cluster administrators therefore deploy the Kubernetes control plane, worker nodes, and pods in a topological spread that can tolerate the failure of a single fault domain. Such setups achieve high availability and gracefully handle common zonal failures such as network partitions, power loss, reboots, and bad software deployments. This talk walks through numerous real-world zonal outages, ranging from partial to full, and the behavior of Kubernetes components in those situations. The speakers operate a large fleet of Kubernetes control planes at Amazon Web Services; they will share stories of zonal outages and the improvements that helped achieve greater resiliency for thousands of clusters.
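As a rough illustration of what "a topological spread that can tolerate a single fault domain failure" means, the following Python sketch checks a replica placement against the loss of any one zone. It is not taken from the talk: the zone names, the helper functions, and the min_healthy threshold are all hypothetical, and the skew calculation only loosely mirrors the idea behind Kubernetes topology spread constraints.

```python
from collections import Counter


def replicas_per_zone(placement, zones):
    """Count replicas in every known zone, including zones with none."""
    counts = Counter(placement)
    return {zone: counts.get(zone, 0) for zone in zones}


def max_skew(per_zone):
    """Difference between the most and least populated zones."""
    return max(per_zone.values()) - min(per_zone.values())


def survives_single_zone_loss(per_zone, min_healthy):
    """True if at least min_healthy replicas remain after losing any one zone."""
    total = sum(per_zone.values())
    return all(total - lost >= min_healthy for lost in per_zone.values())


# Hypothetical zones and placements, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]
spread = replicas_per_zone(
    ["zone-a", "zone-b", "zone-c", "zone-a", "zone-b", "zone-c"], zones
)
packed = replicas_per_zone(["zone-a"] * 5 + ["zone-b"], zones)

for name, per_zone in (("spread", spread), ("packed", packed)):
    print(name, per_zone, "skew:", max_skew(per_zone),
          "tolerates one zone loss:", survives_single_zone_loss(per_zone, 4))
```

With six replicas and an assumed requirement of four healthy replicas, the evenly spread placement keeps four replicas after any single zone is lost, while the packed placement drops to one replica if zone-a fails.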
