Crypto Price Trend Analysis with PySpark | Stock Market & E-commerce Use Cases 2025
🔍 Want to analyze cryptocurrency price trends and detect profitable patterns? In this video, I’ll show you how to use PySpark to process and analyze historical coin price data to detect trends like consecutive price increases, volume surges, and more! 📊
📌 What You’ll Learn in This Video:
✅ How to create sample crypto price data (Bitcoin, Ethereum, Solana) in PySpark
✅ How to use Window Functions to find coins with 3+ days of consecutive price growth
✅ How to run optimized queries for trend detection
💡 Where Else Can You Use This Concept?
This method isn’t just for crypto! The same technique is used in the areas below (a sales sketch follows the list):
📈 Stock Market – Find stocks with consecutive price gains
🛒 E-commerce – Identify trending products with increasing sales
🌦️ Weather Data – Detect heatwaves with rising temperatures
🏢 Employee Productivity – Track employees with continuous performance improvements
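For instance, here is a minimal sketch of the same window-function pattern applied to daily e-commerce sales; the DataFrame and column names (sales_df, Product_ID, Sale_Date, Units_Sold) are made up for illustration:

# Hypothetical sales data -- adapt the names to your own dataset
from pyspark.sql import SparkSession
from pyspark.sql.window import Window
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("SalesTrend").getOrCreate()

sales_data = [
    (101, "2024-02-01", 50),
    (101, "2024-02-02", 60),
    (101, "2024-02-03", 70),
    (102, "2024-02-01", 40),
    (102, "2024-02-02", 35),
    (102, "2024-02-03", 45),
]
sales_df = spark.createDataFrame(sales_data, ["Product_ID", "Sale_Date", "Units_Sold"]) \
    .withColumn("Sale_Date", F.to_date("Sale_Date", "yyyy-MM-dd"))

# Same window trick: compare each day's sales with the previous two days
w = Window.partitionBy("Product_ID").orderBy("Sale_Date")
trending = sales_df \
    .withColumn("Prev_1", F.lag("Units_Sold", 1).over(w)) \
    .withColumn("Prev_2", F.lag("Units_Sold", 2).over(w)) \
    .filter((F.col("Units_Sold") > F.col("Prev_1")) & (F.col("Prev_1") > F.col("Prev_2"))) \
    .select("Product_ID").distinct()

trending.show()  # only product 101 has 3 days of rising sales in this sample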
🚀 Real-World Applications:
🔹 Traders use this analysis to find momentum stocks & crypto trends
🔹 Retailers track bestselling products
🔹 HR teams identify top-performing employees
🛠 Technologies Used:
PySpark for data processing
SQL & Window Functions for trend detection
Pandas for small-scale analysis (a quick pandas sketch follows this list)
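For small datasets, a rough pandas equivalent of the same consecutive-increase check could look like this (the column names mirror the PySpark example and are illustrative only):

# A minimal pandas sketch of the same idea for small-scale analysis
import pandas as pd

pdf = pd.DataFrame({
    "Coin_Name": ["Bitcoin"] * 3 + ["Ethereum"] * 3,
    "Date": pd.to_datetime(["2024-02-01", "2024-02-02", "2024-02-03"] * 2),
    "Close_Price": [435, 440, 439, 325, 330, 335],
})
pdf = pdf.sort_values(["Coin_Name", "Date"])

# Shift within each coin to get the previous two closing prices
pdf["Prev_1"] = pdf.groupby("Coin_Name")["Close_Price"].shift(1)
pdf["Prev_2"] = pdf.groupby("Coin_Name")["Close_Price"].shift(2)

rising = pdf[(pdf["Close_Price"] > pdf["Prev_1"]) & (pdf["Prev_1"] > pdf["Prev_2"])]
print(rising["Coin_Name"].unique())  # -> ['Ethereum']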
📊 Dataset: Historical cryptocurrency prices 📈
📌 Code available below:
from pyspark.sql import SparkSession
from pyspark.sql.window import Window
import pyspark.sql.functions as F
# Create the SparkSession (already available in notebook environments like Databricks)
spark = SparkSession.builder.appName("CryptoTrendAnalysis").getOrCreate()

# Sample data creation
data = [
(1, "Bitcoin", "2024-02-01", 430, 435),
(1, "Bitcoin", "2024-02-02", 435, 440),
(1, "Bitcoin", "2024-02-03", 440, 439),
(1, "Bitcoin", "2024-02-04", 445, 430),
(2, "Ethereum", "2024-02-01", 320, 325),
(2, "Ethereum", "2024-02-02", 325, 330),
(2, "Ethereum", "2024-02-03", 330, 335),
(2, "Ethereum", "2024-02-04", 335, 340),
(3, "Solana", "2024-02-01", 120, 125),
(3, "Solana", "2024-02-02", 125, 130),
(3, "Solana", "2024-02-03", 130, 135),
(3, "Solana", "2024-02-04", 135, 140),
]
columns = ["Coin_ID", "Coin_Name", "Date", "Open_Price", "Close_Price"]
df = spark.createDataFrame(data, columns)
# Convert Date to a proper date type
df = df.withColumn("Date", F.to_date(F.col("Date"), "yyyy-MM-dd"))
# Define a window function per coin, ordered by date, to check the previous 2 days
windowSpec = Window.partitionBy("Coin_ID").orderBy("Date")
df_with_lag = df.withColumn("Prev_Close_1", F.lag("Close_Price", 1).over(windowSpec)) \
.withColumn("Prev_Close_2", F.lag("Close_Price", 2).over(windowSpec))
# Filter coins whose closing price increased for 3 consecutive days
df_result = df_with_lag.filter(
    (F.col("Close_Price") > F.col("Prev_Close_1")) &
    (F.col("Prev_Close_1") > F.col("Prev_Close_2"))
).select("Coin_ID", "Coin_Name").distinct()

df_result.show()
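With this sample data, only Ethereum and Solana have a closing price that rises over three consecutive days (Bitcoin's close dips on 2024-02-03), so those are the two coins returned.

As a rough sketch, the same check can also be expressed with Spark SQL instead of the DataFrame API, assuming the df built above is registered as a temp view (the view name crypto_prices is just an example):

# Assumes `df` is the DataFrame created above; "crypto_prices" is an arbitrary view name
df.createOrReplaceTempView("crypto_prices")
sql_result = spark.sql("""
    SELECT DISTINCT Coin_ID, Coin_Name
    FROM (
        SELECT Coin_ID, Coin_Name, Close_Price,
               LAG(Close_Price, 1) OVER (PARTITION BY Coin_ID ORDER BY `Date`) AS Prev_Close_1,
               LAG(Close_Price, 2) OVER (PARTITION BY Coin_ID ORDER BY `Date`) AS Prev_Close_2
        FROM crypto_prices
    ) t
    WHERE Close_Price > Prev_Close_1
      AND Prev_Close_1 > Prev_Close_2
""")
sql_result.show()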
✅ Subscribe for more data engineering & PySpark tutorials! 🚀
#pyspark #cryptoanalysis #bitcoin #ethereum #solana #datascience #bigdata #cryptotrading #python #stockmarket #ecommerce #machinelearning #finance #trendanalysis #dataengineering #education #azuredataengineer #picoin #2025