AWS re:Invent 2023 - Netflix’s journey to an Apache Iceberg–only data lake (NFX306)

Netflix operates a data lake of approximately one exabyte. Of that, about 300 petabytes remained in the legacy Apache Hive table format. Motivated by the well-known benefits of Apache Iceberg, such as time travel and schema evolution, Netflix fully phased out Hive and migrated its existing data to Iceberg. In this session, learn how Netflix carried out the migration at that scale with custom tooling and how it developed in-house features such as secure Iceberg tables and the Iceberg REST catalog. Learn about Netflix’s journey from a Hive-based to an Iceberg-only data warehouse and how it overcame the challenges that arose during the transition.

Learn more about AWS re:Invent at https://go.aws/46iuzGv.

Subscribe:
More AWS videos: http://bit.ly/2O3zS75
More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS
Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts.

AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2023
