PySpark Transformations and Actions | show, count, collect, distinct, withColumn, filter, groupby

In this video, I will show you how to apply basic transformations and actions on a Spark DataFrame. We will explore show, count, collect, distinct, withColumn, withColumnRenamed, filter, groupby, lit, and when.
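
A minimal, self-contained sketch of the actions covered in the video (show, count, collect, distinct). It uses a small in-memory DataFrame instead of the UCI dataset from the video, so the column names here are illustrative only:

from pyspark.sql import SparkSession

# Toy DataFrame standing in for the UCI data used in the video.
spark = SparkSession.builder.appName("pyspark-basics").getOrCreate()
df = spark.createDataFrame(
    [("a", 1), ("b", 2), ("a", 3)],
    ["category", "value"],
)

df.show(2)                                # action: print the top N rows (default 20)
print(df.count())                         # action: number of rows -> 3
rows = df.collect()                       # action: all rows on the driver as a list of Row objects
df.select("category").distinct().show()   # unique values in a column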

Below are the contents of this video; a short code sketch of these transformations follows the list:

1. Get distinct values in a column
2. Create new constant columns
3. Create new columns based on conditions
4. Filter data based on a rule
5. Groupby and aggregate columns 
6. Rename columns in a DF
7. Rename columns using alias after aggregation
8. Collect the output of a dataframe
9. Show top N rows
10. Count number of rows in a DF
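
A matching sketch for the transformations in the list above (withColumn with lit/when, withColumnRenamed, filter, groupBy with agg and alias). It continues from the toy DataFrame in the previous snippet; again, the column names are placeholders, not the ones from the UCI dataset:

from pyspark.sql import functions as F

df2 = (
    df.withColumn("source", F.lit("uci"))          # 2. new constant column
      .withColumn("size",                          # 3. new column from a condition
                  F.when(F.col("value") > 2, "big").otherwise("small"))
      .withColumnRenamed("category", "label")      # 6. rename a column
      .filter(F.col("value") > 1)                  # 4. keep rows matching a rule
)

df2.groupBy("label").agg(                          # 5. groupby and aggregate
    F.sum("value").alias("total_value"),           # 7. alias renames the aggregated column
    F.count("*").alias("n_rows"),
).show()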

Link to the playlist "Getting started with PySpark": • Getting started with PySpark

Link to the "Setting up the PySpark environment on Google Colab" video: • Setting up the PySpark environment on...

Link to the GitHub repo: https://github.com/Abhishekmamidi123/...

Check out my "Data Science guide for freshers and enthusiasts" playlist: • My path to becoming a Data Scientist ...
I have put my 3 years of learning experience into this playlist.

Timestamps:
00:00 - Contents of this video
01:59 - Setting up the PySpark environment
02:19 - Initialize Spark Session object
02:28 - Read data from UCI
02:57 - show()
03:24 - count()
03:44 - collect()
05:25 - distinct()
06:37 - withColumn()
09:10 - withColumnRenamed()
10:11 - filter()
12:24 - groupby()
14:19 - Summary and Subscribe :)

Please like this video, share it with your friends, and subscribe to the channel. Keep learning :)

Follow me here:
LinkedIn: / abhishekmamidi
Blog: https://www.abhishekmamidi.com/
GitHub: http://github.com/Abhishekmamidi123
Kaggle: http://www.kaggle.com/abhishekmamidi

Tags:
abhishek mamidi, data science, machine learning, deep learning, artificial intelligence, internship, career, college, job, experience, krish naik, ai engineering, fresher, data science enthusiasts, pyspark, apache spark, python, pysparkling
