Everyday I'm Shuffling - Tips for Writing Better Apache Spark Programs

Want to learn how to write faster and more efficient programs for Apache Spark? Two Spark experts from Databricks, Vida Ha and Holden Karau, provide some performance tuning and testing tips for your Spark applications.

Overview:
Understanding the Shuffle in Spark
Common causes of inefficiency
Understanding when code runs on the driver vs. the workers
Common causes of errors
How to factor your code for reuse between batch and streaming

View slides at: http://www.slideshare.net/databricks/...

Additional reading:

7 Tips to Debug Apache Spark Code Faster with Databricks
https://databricks.com/blog/2016/10/1...

Databricks Best Practices and Tips
https://docs.databricks.com/user-guid...

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie...

Connect with us:
Website: https://databricks.com
Facebook: /databricksinc
Twitter: /databricks
LinkedIn: /databricks
Instagram: /databricksinc

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-nam...
