Accelerating Data Ingestion with Databricks Autoloader


Tracking which incoming files have been processed has always required thought and design when implementing an ETL framework. The Autoloader feature of Databricks looks to simplify this, taking away the pain of file watching and queue management. However, there can also be a lot of nuance and complexity in setting up Autoloader and managing the process of ingesting data using it. After implementing an automated data loading process in a major US CPMG, Simon has some lessons to share from the experience.

This session will run through the initial setup and configuration of Autoloader in a Microsoft Azure environment, looking at the components used and what is created behind the scenes. We’ll then look at some of the limitations of the feature, before walking through the process of overcoming them. We will build out a practical example that tackles evolving schemas, applies transformations to your stream, extracts telemetry from the process and, finally, merges the incoming data into a Delta table.
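To give a flavour of the steps listed above, here is a minimal PySpark sketch. It is not the code from the session. It assumes a Databricks runtime (where the `cloudFiles` source and the `delta` package are available), JSON source files and an existing Delta target table. The helper names, the `ingested_at` column, the paths and the key column are all hypothetical, chosen for illustration only.

```python
from typing import Callable, Dict, Optional


def autoloader_options(source_format: str, schema_location: str,
                       use_notifications: bool = True) -> Dict[str, str]:
    """Build cloudFiles options (hypothetical helper, not from the talk)."""
    return {
        "cloudFiles.format": source_format,
        # Where Autoloader stores the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Evolve the schema when new columns appear in incoming files.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        # File notification mode (queue-based) instead of directory listing.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
    }


def make_merge_fn(spark, target_table: str, key_column: str) -> Callable:
    """Return a foreachBatch handler that upserts each micro-batch."""
    def merge_batch(batch_df, batch_id: int) -> None:
        # Imported lazily: the delta package is assumed to be on the cluster.
        from delta.tables import DeltaTable
        target = DeltaTable.forName(spark, target_table)
        (target.alias("t")
               .merge(batch_df.alias("s"),
                      f"t.{key_column} = s.{key_column}")
               .whenMatchedUpdateAll()
               .whenNotMatchedInsertAll()
               .execute())
    return merge_batch


def summarise_progress(progress: Optional[dict]) -> Dict[str, int]:
    """Reduce a streaming progress record to basic telemetry.

    Assumes the dict shape returned by query.lastProgress in PySpark.
    """
    progress = progress or {}
    return {
        "batch_id": progress.get("batchId", -1),
        "rows": progress.get("numInputRows", 0),
    }


def start_ingest(spark, source_path: str, checkpoint_path: str,
                 target_table: str, key_column: str):
    """Wire the pieces together: read, transform, merge into Delta."""
    from pyspark.sql import functions as F
    options = autoloader_options("json", checkpoint_path + "/schema")
    stream = (spark.readStream.format("cloudFiles")
              .options(**options)
              .load(source_path)
              # Example transformation: stamp each row with its load time.
              .withColumn("ingested_at", F.current_timestamp()))
    return (stream.writeStream
            .foreachBatch(make_merge_fn(spark, target_table, key_column))
            .option("checkpointLocation", checkpoint_path)
            .start())
```

On a cluster, calling `start_ingest(spark, ...)` returns a streaming query, and `summarise_progress(query.lastProgress)` gives a simple per-batch row count. Two caveats worth checking against the current documentation: a merge can fail if one micro-batch holds several rows for the same key, and with `addNewColumns` the stream is expected to stop when a new column first appears, so a restart policy is usually needed.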

After this session you will be better equipped to use Autoloader in a data ingestion platform, simplifying your production workloads and accelerating the time to realise value in your data!

Get insights on how to launch a successful lakehouse architecture in Rise of the Data Lakehouse by Bill Inmon, the father of the data warehouse. Download the ebook: https://dbricks.co/3L8PFQL

Connect with us:
Website: https://databricks.com
Facebook: /databricksinc
Twitter: /databricks
LinkedIn: /databricks
Instagram: /databricksinc

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-nam...
