The BEST library for building Data Pipelines...

Building data pipelines with #python is an important skill for data engineers and data scientists. But which library is best for the job? In this video we look at three options: pandas, Polars, and Spark (PySpark).

Timeline:
00:00 Data Pipelines
01:11 The Data
02:32 Pandas
04:34 Polars
06:15 PySpark
09:15 Spark SQL

Follow me on Twitch for live coding streams: / medallionstallion_

My other videos:

Speed Up Your Pandas Code: Make Your Pandas Code Lightning Fast
Intro to Pandas video: A Gentle Introduction to Pandas Data ...
Exploratory Data Analysis Video: Exploratory Data Analysis with Pandas...

Working with Audio data in Python: Audio Data Processing in Python
Efficient Pandas Dataframes: Speed Up Your Pandas Dataframes

* YouTube: https://youtube.com/@robmulla?sub_con...
* Discord: / discord
* Twitch: / medallionstallion_
* Twitter: / rob_mulla
* Kaggle: https://www.kaggle.com/robikscube

#python #polars #spark #dataengineering
