Principal Component Analysis in R: Example with Predictive Model & Biplot Interpretation

Walks through the steps for carrying out principal component analysis in R and for using the principal components to develop a predictive model.
R code: https://github.com/bkrai/Top-10-Machi...
00:00 Introduction - Principal Component Analysis in R
00:05 Iris Data
01:16 Partition Data
02:06 Scatter Plots & Correlation Coefficients
05:02 Principal Component Analysis
10:17 Orthogonality of Principal Components
11:38 Biplot Interpretation
18:31 Prediction with Principal Components
19:50 Multinomial Logistic Regression Model with First Two PCs
21:07 Confusion Matrix & Misclassification Error ‘Training Data’
22:25 Confusion Matrix & Misclassification Error ‘Testing Data’
22:48 PCA Advantage
23:24 PCA Disadvantage
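
The chapters above can be sketched in R roughly as follows. This is a minimal sketch and not the code from the linked repository: the seed, the 80/20 split, the variable names, and the choice of `nnet::multinom` for the multinomial logistic regression are all assumptions made for illustration.

```r
library(nnet)  # multinom(); assumed choice of multinomial logistic regression

# Partition the iris data (seed and 80/20 split are arbitrary assumptions)
set.seed(111)
idx <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
training <- iris[idx == 1, ]
testing  <- iris[idx == 2, ]

# PCA on the four numeric predictors, centered and scaled
pc <- prcomp(training[, -5], center = TRUE, scale. = TRUE)
summary(pc)  # standard deviation and proportion of variance per component

# Project both partitions onto the components learned from the training data
trg <- data.frame(predict(pc, training), Species = training$Species)
tst <- data.frame(predict(pc, testing),  Species = testing$Species)

# Multinomial logistic regression on the first two principal components
model <- multinom(Species ~ PC1 + PC2, data = trg, trace = FALSE)

# Confusion matrices and misclassification errors
tab_trg <- table(Predicted = predict(model, trg), Actual = trg$Species)
tab_tst <- table(Predicted = predict(model, tst), Actual = tst$Species)
err_trg <- 1 - sum(diag(tab_trg)) / sum(tab_trg)
err_tst <- 1 - sum(diag(tab_tst)) / sum(tab_tst)
```

The test partition is projected with the centering, scaling, and rotation estimated on the training partition only, so no information from the test data leaks into the components.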

What is Principal Component Analysis?
- Principal Component Analysis (PCA) is a statistical technique widely used for dimensionality reduction in data analysis and visualization. It transforms a dataset of possibly correlated variables into a set of linearly uncorrelated variables known as principal components. The components are ordered by the amount of variance they explain, so the first few retain most of the variation present in the original dataset. This makes PCA a powerful tool for simplifying high-dimensional data while preserving as much information as possible. The process involves calculating the eigenvalues and eigenvectors of the data's covariance matrix: the eigenvectors give the directions of maximum variance, and the eigenvalues give the amount of variance along each direction. Projecting the original data onto these new axes supports data compression, noise reduction, and the identification of underlying patterns, which makes PCA valuable for exploratory data analysis, predictive modeling, and visualizing genetic data, among other applications.
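
To make the eigenvalue and eigenvector step concrete, here is a small sketch on the iris data that computes the components by hand and compares them with `prcomp`. The variable names are illustrative assumptions; standardizing the variables first is also an assumption, and it makes the covariance matrix equal to the correlation matrix.

```r
# Standardize the four numeric iris variables (center to 0, scale to unit variance)
X <- scale(iris[, 1:4])

# Eigen decomposition of the covariance matrix of the standardized data
eig <- eigen(cov(X))

# Proportion of total variance explained by each component
prop_var <- eig$values / sum(eig$values)

# Scores: the data projected onto the eigenvectors (the new axes)
scores <- X %*% eig$vectors

# The same analysis with prcomp, for comparison
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Eigenvalues match the squared standard deviations reported by prcomp
all.equal(eig$values, pc$sdev^2)

# Eigenvectors match the prcomp rotation up to the sign of each column
max(abs(abs(eig$vectors) - abs(unname(pc$rotation))))

# The components are uncorrelated: off-diagonal entries are numerically zero
round(cor(scores), 10)
```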

Principal component analysis is an important statistical tool for anyone analyzing big data or working in the data science field.

Machine Learning videos: https://goo.gl/WHHqWP
Becoming Data Scientist: https://goo.gl/JWyyQc
Introductory R Videos: https://goo.gl/NZ55SJ
Deep Learning with TensorFlow: https://goo.gl/5VtSuC
Image Analysis & Classification: https://goo.gl/Md3fMi
Text mining: https://goo.gl/7FJGmd
Data Visualization: https://goo.gl/Q7Q2A8
Playlist: https://goo.gl/iwbhnE

R is a free software environment for statistical computing and graphics, and is widely used in both academia and industry. R runs on both Windows and macOS. It was ranked no. 1 in a KDnuggets poll on top languages for analytics, data mining, and data science. RStudio is a popular, user-friendly environment for R.