Expectation Maximization Algorithm | Intuition & General Derivation


How do you fit Gaussian Mixture Models for clustering high-dimensional data or as generative models? The EM algorithm makes maximum likelihood estimation possible in the presence of latent data. Here are the notes: https://raw.githubusercontent.com/Cey...

Maximum Likelihood Estimation is a great starting point for fitting the parameters of a model when you only have access to data. However, it breaks down once your model contains latent random variables, i.e., nodes for which you do not observe any data. A remedy is to work with the marginal likelihood instead of the full likelihood, but this approach leads to some difficulties that we have to overcome.
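
For concreteness (in my notation, which may differ slightly from the notes): with observed data x, latent variables z, and parameters \theta, the marginal likelihood integrates the latent variables out of the joint,

\log p(x \mid \theta) = \log \int p(x, z \mid \theta) \, dz ,

with the integral replaced by a sum for discrete latent variables. It is this quantity that EM seeks to maximize.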

In this video, I show how to derive a lower bound on the marginal log-likelihood, including all the necessary tricks like the importance sampling trick and Jensen's inequality. We then end up with a chicken-and-egg problem: we need the distribution's parameters to compute the bound, but we also need the bound to update the parameters. Consequently, we have to resort to an iterative algorithm, which consists of the E-Step (Expectation) and the M-Step (Maximization).
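
Sketched in symbols (again my notation; q is an arbitrary proposal distribution over the latent variables): multiplying and dividing by q inside the integral (the importance sampling trick) and then pulling the concave logarithm inside the expectation via Jensen's inequality gives the lower bound

\log p(x \mid \theta) = \log \int q(z) \frac{p(x, z \mid \theta)}{q(z)} \, dz \;\ge\; \int q(z) \log \frac{p(x, z \mid \theta)}{q(z)} \, dz .

The bound is tight when q(z) equals the posterior p(z \mid x, \theta), which is exactly the chicken-and-egg problem: evaluating this posterior already requires the parameters. EM therefore alternates

E-Step: \; q(z) \leftarrow p(z \mid x, \theta^{\text{old}}) ,
M-Step: \; \theta^{\text{new}} \leftarrow \arg\max_{\theta} \, \mathbb{E}_{q(z)} \big[ \log p(x, z \mid \theta) \big] .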

An important remark: the derivations I deliver here are just a framework. For each application scenario, for instance Gaussian Mixture Models, the maximization has to be carried out anew to end up with simple update equations; a sketch of what those can look like follows below.
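
As an illustration, here is a minimal, self-contained sketch of EM for a one-dimensional Gaussian Mixture Model (my own example code, not from the channel's repository; the function name em_gmm_1d is made up):

import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, n_components=2, n_iter=50, seed=0):
    # Illustrative EM sketch for a 1-D Gaussian Mixture Model.
    rng = np.random.default_rng(seed)
    pi = np.full(n_components, 1.0 / n_components)        # mixing weights
    mu = rng.choice(x, size=n_components, replace=False)  # init means at data points
    sigma = np.full(n_components, x.std())                # shared initial spread
    for _ in range(n_iter):
        # E-Step: responsibilities, i.e., the posterior over latent component assignments
        dens = np.stack([p * norm.pdf(x, m, s) for p, m, s in zip(pi, mu, sigma)])
        r = dens / dens.sum(axis=0)                       # shape (K, N)
        # M-Step: closed-form updates from the expected complete-data log-likelihood
        Nk = r.sum(axis=1)
        pi = Nk / x.size
        mu = (r @ x) / Nk
        sigma = np.sqrt((r * (x - mu[:, None]) ** 2).sum(axis=1) / Nk)
    return pi, mu, sigma

# Usage: recover two well-separated clusters
rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 700)])
print(em_gmm_1d(x))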

-------
Info on why the Expectation Maximization algorithm does not work for the Bernoulli-Bernoulli model:

[TODO] I will work on a video on this, stay tuned ;)

-------

📝 : Check out the GitHub Repository of the channel, where I upload all the handwritten notes and source-code files (contributions are very welcome): https://github.com/Ceyron/machine-lea...

📢 : Follow me on LinkedIn or Twitter for updates on the channel and other cool Machine Learning & Simulation stuff:   / felix-koehler   and   / felix_m_koehler  

💸 : If you want to support my work on the channel, you can become a patron on Patreon here:   / mlsim  

-------

Timestamps:
00:00 Introduction
00:48 Latent means missing data
02:15 How to define the Likelihood?
02:55 Marginal Likelihood
05:05 Disclaimer: It will not work
05:48 Marginal Likelihood (cont.)
06:15 Marginal Log-Likelihood
08:11 Importance Sampling Trick
11:31 Jensen's Inequality
13:03 A lower bound (error, see comments below)
15:23 The Posterior over the latent variables
16:20 A lower bound (cont.) (error, see comments below)
17:56 The Chicken-Egg Problem
20:18 Old and new parameters
21:55 The Maximization Procedure
22:56 A simplified upper bound
25:04 Responsibilities
25:46 The EM Algorithm
28:28 An MLE under missing data
29:07 Outro
