TIME SERIES CLUSTERING | HDBSCAN for Clustering 811 Products Sales

In this video, we learn HDBSCAN, a density-based clustering algorithm, and then apply it to find clusters in weekly sales transactions. HDBSCAN can be used with any distance measure, but we will use only two: Euclidean and Dynamic Time Warping (DTW). We will see how the clustering results differ between the two distances.
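
To give a feel for why the two distances can produce different clusters, here is a minimal sketch in plain Python. It is not taken from the linked notebooks: the function names and the toy sales numbers are made up for illustration, and the DTW shown is the textbook version with an absolute-difference cost and no warping window.

```python
import math

def euclidean(a, b):
    """Point-by-point Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Textbook dynamic time warping with absolute-difference cost (illustrative)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Hypothetical weekly sales: the same spike, shifted by one week.
product_a = [0, 0, 5, 0, 0, 0]
product_b = [0, 0, 0, 5, 0, 0]

d_euclid = euclidean(product_a, product_b)
d_dtw = dtw(product_a, product_b)
print(d_euclid, d_dtw)
```

Here DTW aligns the two spikes and returns 0, while the Euclidean distance is sqrt(50), about 7.07. Two products with the same sales shape but different timing can therefore look close under DTW and far apart under Euclidean distance.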

Source code:
https://www.kaggle.com/code/leessteph... (real data set)
https://www.kaggle.com/leesstephanie/... (synthetic data set)
https://github.com/stephanielees/HDBS...

More explanation about linkage: https://youtube.com/clip/Ugkx1r3GK144...

00:00 Intro
01:16 The intuition of HDBSCAN
01:53 Preparing to go through the HDBSCAN algorithm
05:58 Core distance
07:18 Mutual reachability distance
10:38 Minimum Spanning Tree
11:26 Single Linkage Tree, Condensed Tree
20:39 Cluster selection
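
The first two steps in the chapter list above (core distance and mutual reachability distance) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code from the video: the helper names and the four toy points are hypothetical, and the core distance here excludes the point itself when counting its k nearest neighbors, which some implementations handle differently.

```python
import math

def pairwise(points, dist):
    """Full matrix of pairwise distances."""
    n = len(points)
    return [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def core_distances(d, k):
    """Distance from each point to its k-th nearest neighbor (self excluded)."""
    return [sorted(row[:i] + row[i + 1:])[k - 1] for i, row in enumerate(d)]

def mutual_reachability(d, core):
    """d_mreach(a, b) = max(core(a), core(b), d(a, b)); 0 on the diagonal."""
    n = len(d)
    return [[0.0 if i == j else max(core[i], core[j], d[i][j])
             for j in range(n)] for i in range(n)]

# Hypothetical 1-D points: three close together and one far away.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
d = pairwise(points, math.dist)
core = core_distances(d, k=2)
mreach = mutual_reachability(d, core)
print(core)    # [2.0, 1.0, 2.0, 9.0]
print(mreach[2][3])  # 9.0
```

The isolated point gets a large core distance, so every mutual reachability distance involving it is at least that large. This pushes sparse points away from dense regions before the minimum spanning tree is built on these distances.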

Application with Python:
24:09 Load data
26:51 Visualization
28:39 Apply HDBSCAN with Euclidean distance
33:35 Apply HDBSCAN with DTW distance
37:07 Discussion

#timeseries #clustering #machinelearning #retailsales #sales #datascience #pythonprogramming #timeseriesclustering
