Spark-Submit - Deploy PySpark Jobs on a Standalone Single Node

00:00 - spark-submit --help
02:30 - location of spark-submit executable
04:12 - confirm your .bashrc has spark environment path set
05:50 - start-master.sh + other shell scripts in sbin directory
07:01 - start spark master with sbin/start-master.sh
07:54 - localhost:8080 for spark master in a web browser
08:24 - start a worker node with bin/spark-class
10:55 - location of default spark script examples
13:07 - run example script pi.py with spark-submit
17:04 - run example script wordcount.py with spark-submit
23:25 - validate spark jobs in spark master web UI
24:07 - word count on Donald Trump tweets
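
The steps in the timeline above can be sketched as a small Python helper that assembles the command lines involved. This is a minimal sketch, not the exact commands typed in the video: the `/opt/spark` fallback for `SPARK_HOME`, the `spark://localhost:7077` master URL, and the `tweets.txt` input file are assumptions, so copy the real master URL from the top of the web UI at localhost:8080 and substitute your own paths.

```python
import os
import shlex

# Assumption: SPARK_HOME is set in .bashrc; /opt/spark is only a placeholder fallback.
SPARK_HOME = os.environ.get("SPARK_HOME", "/opt/spark")
# Assumption: the master listens on its default port 7077 on this machine.
MASTER_URL = "spark://localhost:7077"


def start_master_cmd(spark_home=SPARK_HOME):
    """Command that launches the standalone master (web UI on port 8080)."""
    return [os.path.join(spark_home, "sbin", "start-master.sh")]


def start_worker_cmd(master_url=MASTER_URL, spark_home=SPARK_HOME):
    """Command that starts one worker and registers it with the master."""
    return [
        os.path.join(spark_home, "bin", "spark-class"),
        "org.apache.spark.deploy.worker.Worker",
        master_url,
    ]


def example_script(name, spark_home=SPARK_HOME):
    """Path to one of the Python examples bundled with Spark."""
    return os.path.join(spark_home, "examples", "src", "main", "python", name)


def submit_cmd(script, *app_args, master_url=MASTER_URL, spark_home=SPARK_HOME):
    """spark-submit command for a PySpark script plus its own arguments."""
    return [
        os.path.join(spark_home, "bin", "spark-submit"),
        "--master", master_url,
        script,
        *app_args,
    ]


# Print the commands in the order the timeline walks through them.
steps = [
    start_master_cmd(),
    start_worker_cmd(),
    submit_cmd(example_script("pi.py"), "10"),  # 10 = number of partitions
    submit_cmd(example_script("wordcount.py"), "tweets.txt"),  # hypothetical input file
]
for cmd in steps:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a terminal, or the lists can be passed to `subprocess.run`. The worker command stays in the foreground, so it needs its own terminal. Once a job finishes, it should appear under completed applications in the master web UI.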

If you're new to Spark or Python, you might find these other videos useful:

Apache Spark - Install Spark3, PySpark3 on Ubuntu 20.04, Debian, Python 3.8 - Part 1b

PySpark Framework - Python Functional and Object Orientated Programming - Part 1 - ETL

Python Install - compile from source with SSL module, pip TLS/ SSL - Debian, Ubuntu
