Matt Dancho (Business Science)
@mdancho84
The 10 types of clustering that all data scientists need to know.

Let's dive in:
Matt Dancho (Business Science)
@mdancho84
1. K-Means Clustering:

This is a centroid-based algorithm, where the goal is to minimize the sum of squared distances between points and their assigned cluster centroids.
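A minimal from-scratch sketch of the idea, using Lloyd's algorithm in plain Python. The function names, the toy points, and the choice of starting centroids are assumptions made for illustration. In practice you would likely reach for a library implementation.

```python
import math

def kmeans(points, centroids, n_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids

def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances (the quantity k-means drives down)."""
    return sum(math.dist(p, centroids[k]) ** 2 for p, k in zip(points, labels))

# Toy data (hypothetical): two well-separated groups of three points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids, inertia(points, labels, centroids))
```

The result depends on the starting centroids, which is why library versions usually run several initializations.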
Matt Dancho (Business Science)
@mdancho84
2. Hierarchical Clustering:

This method creates a tree of clusters. It is subdivided into Agglomerative (bottom-up approach) and Divisive (top-down approach).
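The bottom-up (agglomerative) variant can be sketched in a few lines. This version assumes single linkage (cluster distance = closest pair of points), which is only one of several linkage choices. The function name and toy data are hypothetical.

```python
import math

def agglomerative(points, n_clusters):
    """Bottom-up clustering with single linkage, stopping at n_clusters."""
    # Start with every point in its own cluster (stored as lists of indices).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair; each merge is one join in the cluster tree.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Toy data (hypothetical): two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
three = agglomerative(points, 3)
two = agglomerative(points, 2)
print(three, two)
```

Running the merges all the way down to one cluster and recording each join would give the full tree (dendrogram).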
Matt Dancho (Business Science)
@mdancho84
3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise):

This algorithm defines clusters as areas of high density separated by areas of low density.
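A compact sketch of the density idea, written from scratch in plain Python. The parameter names `eps` and `min_samples` follow common usage, and the toy data is an assumption. A point's neighborhood here includes the point itself.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or NOISE if it sits in a sparse area."""
    n = len(points)
    # Neighborhood of i: all points within eps of it (including i itself).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # not dense enough; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels

# Toy data (hypothetical): two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_samples=3)
print(labels)
```

The isolated point ends up labeled as noise, which is the "areas of low density" part of the definition.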
Matt Dancho (Business Science)
@mdancho84
4. Mean Shift Clustering:

It is a centroid-based algorithm, which updates candidates for centroids to be the mean of points within a given region.
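A rough sketch of that update rule, assuming a flat kernel (every point within `bandwidth` counts equally). The merge rule for nearby modes (closer than half the bandwidth) is an illustrative choice, and the names and data are hypothetical.

```python
import math

def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Shift a candidate from each point toward the local mean until it settles."""
    modes = []
    for p in points:
        center = tuple(p)
        for _ in range(n_iter):
            # Region: all points within `bandwidth` of the current candidate.
            window = [q for q in points if math.dist(center, q) <= bandwidth]
            new_center = tuple(sum(axis) / len(window) for axis in zip(*window))
            moved = math.dist(center, new_center)
            center = new_center
            if moved < tol:
                break
        modes.append(center)
    # Candidates that settled near each other are treated as one cluster.
    centers, labels = [], []
    for m in modes:
        for k, c in enumerate(centers):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers

# Toy data (hypothetical): two small blobs.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels, centers = mean_shift(points, bandwidth=1.0)
print(labels, centers)
```

Unlike K-means, the number of clusters is not given up front; it falls out of the bandwidth.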
Matt Dancho (Business Science)
@mdancho84
5. Gaussian Mixture Models (GMM):

This method uses a probabilistic model to represent subpopulations within an overall population. Each data point gets a probability of belonging to each cluster rather than a single hard assignment.
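To make the soft assignment concrete, here is a small sketch that fits a one-dimensional mixture with expectation-maximization (EM). The 1-D restriction, the starting means, the variance floor, and all names are assumptions chosen to keep the example short.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns parameters and soft assignments."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: probability that each point came from each component.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component
    # resp holds the responsibilities from the final E-step.
    return weights, means, variances, resp

# Toy data (hypothetical): two groups around 1 and 5.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.7, 5.1]
weights, means, variances, resp = fit_gmm_1d(data, init_means=[0.0, 6.0])
print(means, resp[0])
```

Each row of `resp` sums to 1, so a point between two groups would get a split membership instead of a forced label.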
Matt Dancho (Business Science)
@mdancho84
6. Spectral Clustering:

It uses the eigenvectors of a matrix derived from a similarity matrix (typically a graph Laplacian) to reduce dimensionality before applying a clustering algorithm, typically K-means.
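A simplified two-cluster sketch of the spectral idea. It assumes an RBF similarity, the unnormalized graph Laplacian, and a plain power iteration to approximate the second-smallest eigenvector. Splitting by sign stands in for the usual K-means step, which is a shortcut that only works for two clusters. All names and data are hypothetical.

```python
import math

def spectral_bipartition(points, gamma=1.0, n_iter=200):
    """Split points in two using the sign of an approximate Fiedler vector."""
    n = len(points)
    # Similarity matrix: RBF kernel with a zero diagonal.
    W = [
        [0.0 if i == j else math.exp(-gamma * math.dist(points[i], points[j]) ** 2)
         for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in W]
    # Laplacian L = D - W has eigenvalues in [0, 2 * max degree], so power
    # iteration on (shift * I - L) favors the smallest eigenvalues of L.
    shift = 2.0 * max(degree)
    v = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(n_iter):
        # Remove the constant direction (the trivial eigenvector of L).
        mean = sum(v) / n
        v = [x - mean for x in v]
        # Multiply by (shift * I - L), using L v = D v - W v.
        v = [
            shift * v[i] - (degree[i] * v[i] - sum(W[i][j] * v[j] for j in range(n)))
            for i in range(n)
        ]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # One-dimensional embedding; its sign gives the two groups.
    return [0 if x < 0 else 1 for x in v], v

# Toy data (hypothetical): two small groups far apart.
points = [(0, 0), (0, 1), (1, 0), (6, 6), (6, 7), (7, 6)]
labels, embedding = spectral_bipartition(points)
print(labels)
```

For more than two clusters, the usual approach keeps several eigenvectors and clusters the embedded points with K-means.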
Matt Dancho (Business Science)
@mdancho84
7. OPTICS (Ordering Points To Identify the Clustering Structure):

Similar to DBSCAN, but creates a reachability plot to determine clustering structure.
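A small sketch of how the ordering and reachability values can be computed. It assumes an unlimited neighborhood radius (so every point is a core point) and an O(n²) scan instead of the spatial index and priority queue a real implementation would use. Names and data are hypothetical.

```python
import math

def optics_ordering(points, min_samples):
    """Return the visiting order and each point's reachability in that order."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Core distance: distance to the min_samples-th nearest point, counting itself.
    core = [sorted(row)[min_samples - 1] for row in dist]
    reach = [math.inf] * n
    done = [False] * n
    order = []
    for _ in range(n):
        # Visit the unprocessed point with the smallest reachability so far
        # (ties go to the lowest index; the first point starts at infinity).
        p = min((i for i in range(n) if not done[i]), key=lambda i: reach[i])
        done[p] = True
        order.append(p)
        # Reachability of q from p: at least p's core distance.
        for q in range(n):
            if not done[q]:
                reach[q] = min(reach[q], max(core[p], dist[p][q]))
    return order, [reach[i] for i in order]

# Toy data (hypothetical): two tight groups far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
order, plot = optics_ordering(points, min_samples=2)
print(order)
print(plot)
```

Plotting `plot` as a bar chart gives the reachability plot: valleys are clusters, and the tall bar marks the jump from one group to the next.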
Matt Dancho (Business Science)
@mdancho84
8. Affinity Propagation:

It sends messages between pairs of samples until a set of exemplars and corresponding clusters gradually emerges.
Matt Dancho (Business Science)
@mdancho84
9. BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies):

Designed for large datasets, it incrementally and dynamically clusters incoming multi-dimensional metric data points.
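The incremental part rests on a compact summary called a clustering feature: a count, a linear sum, and a sum of squares. The sketch below keeps a flat list of such summaries and is a deliberate simplification, since it leaves out the balanced CF-tree that BIRCH builds on top. The class name, function name, threshold, and data are assumptions.

```python
import math

class ClusteringFeature:
    """Summary of a group of points: count, linear sum, sum of squared norms."""

    def __init__(self, dim):
        self.n = 0
        self.linear_sum = [0.0] * dim
        self.square_sum = 0.0

    def add(self, point):
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.square_sum += sum(x * x for x in point)

    def centroid(self):
        return tuple(s / self.n for s in self.linear_sum)

    def radius_if_added(self, point):
        # radius^2 = SS / n - ||LS / n||^2, computed without storing the points.
        n = self.n + 1
        ls = [s + x for s, x in zip(self.linear_sum, point)]
        ss = self.square_sum + sum(x * x for x in point)
        r2 = ss / n - sum((s / n) ** 2 for s in ls)
        return math.sqrt(max(r2, 0.0))

def birch_flat(stream, threshold):
    """Absorb each incoming point into the nearest summary, or open a new one."""
    features, labels = [], []
    for point in stream:
        target = None
        if features:
            nearest = min(
                range(len(features)),
                key=lambda k: math.dist(point, features[k].centroid()),
            )
            if features[nearest].radius_if_added(point) <= threshold:
                target = nearest
        if target is None:
            features.append(ClusteringFeature(len(point)))
            target = len(features) - 1
        features[target].add(point)
        labels.append(target)
    return labels, features

# Toy stream (hypothetical): points arrive one at a time.
stream = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (0.0, 0.5), (10.5, 10.0)]
labels, features = birch_flat(stream, threshold=1.0)
print(labels, [f.centroid() for f in features])
```

Each point is seen once and then discarded, which is what makes the approach workable on data that does not fit in memory.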
Matt Dancho (Business Science)
@mdancho84
10. CURE (Clustering Using Representatives):

It represents each cluster by a fixed number of well-scattered representative points, shrunk toward the centroid, rather than by the centroid alone.
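A sketch of just the representative-point step. It assumes a farthest-point heuristic for picking scattered points and a shrink factor `alpha` that moves each one toward the centroid. The full algorithm also merges clusters hierarchically, which is omitted here. Names and data are hypothetical.

```python
import math

def cure_representatives(cluster, n_reps, alpha):
    """Pick well-scattered points from a cluster, then shrink them toward its centroid."""
    centroid = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    reps = []
    for _ in range(min(n_reps, len(cluster))):
        remaining = [p for p in cluster if p not in reps]
        if not reps:
            # First representative: the point farthest from the centroid.
            pick = max(remaining, key=lambda p: math.dist(p, centroid))
        else:
            # Next ones: the point farthest from all representatives chosen so far.
            pick = max(remaining, key=lambda p: min(math.dist(p, r) for r in reps))
        reps.append(pick)
    # Shrink each representative a fraction alpha of the way to the centroid.
    return [
        tuple(r_i + alpha * (c_i - r_i) for r_i, c_i in zip(r, centroid))
        for r in reps
    ]

# Toy cluster (hypothetical): four corners of a square plus its center.
cluster = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0), (2.0, 2.0)]
reps = cure_representatives(cluster, n_reps=3, alpha=0.5)
print(reps)
```

Several scattered points can follow a non-spherical cluster shape, and the shrinking dampens the pull of outliers.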
Matt Dancho (Business Science)
@mdancho84
EVERY DATA SCIENTIST NEEDS TO LEARN AI IN 2025.

99% of data scientists are overlooking AI.

I want to help.
Matt Dancho (Business Science)
@mdancho84
On Wednesday, May 21st, I'm sharing one of my best AI Projects: Customer Segmentation Agent with AI

Register here (500 seats): learn.business-science.io/ai-register