I recently wrote code for Gaussian Mixture Model (GMM) based clustering in C++. As always, I found it much convenient to use OpenCV for manipulating matrices. Although there already exist an implementation of Expectation Maximization-based GMM, I tried to understand it by writing my own implementation.
The basic idea of GMM is to first randomly assign each sample to a cluster. This provides initial mixture model for clustering. This is then optimized using Expectation - or the probability/score of assigning each sample to each component in GMM - and Maximization - or updating the characteristics of each mixture component with the given probability/score . An attractive attribute of GMM is its ability to cluster data that does not have clear boundaries for clusters. This is achieved by having a probability/score for each sample from each cluster component.
The basic idea of GMM is to first randomly assign each sample to a cluster. This provides initial mixture model for clustering. This is then optimized using Expectation - or the probability/score of assigning each sample to each component in GMM - and Maximization - or updating the characteristics of each mixture component with the given probability/score . An attractive attribute of GMM is its ability to cluster data that does not have clear boundaries for clusters. This is achieved by having a probability/score for each sample from each cluster component.