Trino for Large Scale ETL at Lyft

At Lyft, we process petabytes of data daily through Trino for a variety of use cases. A single query can run for as long as 4 hours with terabytes of memory reserved. Operating Trino ETL at this scale raises many challenges: how to make all queries as performant as possible while keeping failure rates low; how to define clusters, routing groups, and resource groups for volume that changes throughout the day; how to keep our commitments to user SLOs during unexpected spikes; and so on.

We share what we've done with config tuning, identification of large queries and heavy users, autoscaling, and fault-tolerance features to run Trino at this scale. We'd also like to share our upcoming challenges and our plans to take Trino adoption further across the company.
