K-fold cross validation using Scikit-Learn

𝐊-𝐟𝐨𝐥𝐝 𝐜𝐫𝐨𝐬𝐬 𝐯𝐚𝐥𝐢𝐝𝐚𝐭𝐢𝐨𝐧 is a technique for evaluating the performance of machine learning models. Across multiple iterations it uses different portions of the dataset as the train and test sets, which gives a clearer picture of how well a model generalizes to unseen data.
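As a quick illustration of how the folds rotate, here is a minimal sketch using scikit-learn's KFold on a toy array of ten samples (the array and split settings are placeholders, not the video's setup):

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(10).reshape(-1, 1)  # ten toy samples

# 5 folds: every sample appears in the test set exactly once
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for i, (train_idx, test_idx) in enumerate(kf.split(X)):
    print(f"Fold {i}: train={train_idx}, test={test_idx}")
```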

Scikit-Learn's 𝐭𝐫𝐚𝐢𝐧_𝐭𝐞𝐬𝐭_𝐬𝐩𝐥𝐢𝐭 method produces a single split: one subset of samples is used as the train set and the remaining samples as the test set, so the resulting score depends heavily on that one split and can show high variance. K-fold cross validation, which averages the score over several such splits, provides a more robust estimate of a model's performance.
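A minimal sketch of the difference, assuming a stand-in dataset (iris) and classifier (logistic regression) rather than the ones used in the video:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Single split: the score depends on which samples happened to land in the test set
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
single_score = model.fit(X_tr, y_tr).score(X_te, y_te)

# 5-fold cross validation: averages the score over five different test sets
cv_scores = cross_val_score(model, X, y, cv=5)
print(single_score, cv_scores.mean(), cv_scores.std())
```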

After performing cross validation over the entire dataset, we report the mean score across all the folds. If we are comparing multiple models, we choose the one with the best mean score for the particular dataset.
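For example, comparing two candidate models by their mean cross-validation score might look like this (again on a stand-in dataset; GaussianNB is used here instead of CategoricalNB because the iris features are continuous):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Pick the model with the best mean score across the folds
for name, model in [("SVC", SVC()), ("GaussianNB", GaussianNB())]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean={scores.mean():.3f} +/- {scores.std():.3f}")
```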

𝗚𝗶𝘁𝗛𝘂𝗯 𝗮𝗱𝗱𝗿𝗲𝘀𝘀: https://github.com/randomaccess2023/M...

𝗜𝗺𝗽𝗼𝗿𝘁𝗮𝗻𝘁 𝘁𝗶𝗺𝗲𝘀𝘁𝗮𝗺𝗽𝘀:
01:40 - Import required libraries
03:24 - Load 𝐜𝐚𝐫_𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 dataset
05:59 - Perform preprocessing
11:51 - Separate features and classes
12:19 - Split the dataset
13:41 - Apply 𝐒𝐮𝐩𝐩𝐨𝐫𝐭 𝐕𝐞𝐜𝐭𝐨𝐫 𝐌𝐚𝐜𝐡𝐢𝐧𝐞
16:36 - K-fold cross validation using sklearn's 𝐜𝐫𝐨𝐬𝐬_𝐯𝐚𝐥_𝐬𝐜𝐨𝐫𝐞
21:50 - K-fold cross validation using sklearn's 𝐊𝐅𝐨𝐥𝐝
28:02 - K-fold cross validation using sklearn's 𝐒𝐭𝐫𝐚𝐭𝐢𝐟𝐢𝐞𝐝𝐊𝐅𝐨𝐥𝐝 (see the sketch after this list)
31:32 - Apply 𝐂𝐚𝐭𝐞𝐠𝐨𝐫𝐢𝐜𝐚𝐥 𝐍𝐚𝐢𝐯𝐞 𝐁𝐚𝐲𝐞𝐬
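The full walkthrough is in the GitHub repository linked above; the following is only a rough sketch of the StratifiedKFold and Categorical Naive Bayes steps, using randomly generated integer-encoded features as a stand-in for the preprocessed car_evaluation data:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import CategoricalNB
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(200, 6))  # stand-in for ordinal-encoded categorical features
y = rng.integers(0, 2, size=200)       # stand-in binary labels

# StratifiedKFold keeps the class proportions similar in every fold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# min_categories=4 guards against a category being absent from a training fold
for name, model in [("SVC", SVC()), ("CategoricalNB", CategoricalNB(min_categories=4))]:
    fold_scores = []
    for train_idx, test_idx in skf.split(X, y):
        model.fit(X[train_idx], y[train_idx])
        fold_scores.append(model.score(X[test_idx], y[test_idx]))
    print(f"{name}: mean accuracy = {np.mean(fold_scores):.3f}")
```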

#sklearn #datascience #kfoldcrossvalidation #jupyternotebook #pythonprogramming #machinelearning
