Visualize Thread by @mdancho84 | Thread Navigator

Matt Dancho (Business Science)

@mdancho84

The 10 types of clustering that all data scientists need to know.

Let's dive in:

1. K-Means Clustering:

This is a centroid-based algorithm whose goal is to minimize the sum of squared distances between points and their assigned cluster centroid.
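
A minimal sketch of that objective, assuming scikit-learn and a made-up two-blob dataset (the data, `n_clusters=2`, and the seed are illustrative choices, not from the thread). It recomputes the sum of squared distances by hand and compares it with the value the fitted model reports:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.3, size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# The quantity K-means minimizes: squared distance from each point
# to the centroid of the cluster it was assigned to, summed over points.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
wcss = float(((X - assigned_centroids) ** 2).sum())

print(wcss, kmeans.inertia_)  # the two values should agree
```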

2. Hierarchical Clustering:

This method builds a tree of clusters (a dendrogram). It comes in two forms: Agglomerative (bottom-up) and Divisive (top-down).
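
A rough sketch of the agglomerative (bottom-up) variant only, assuming SciPy and made-up toy data. Ward linkage and the cut at 2 clusters are illustrative choices:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])

# Bottom-up merging: every point starts as its own cluster and the
# two closest clusters are merged at each step. Z records the merges,
# one row per merge, which is the tree itself.
Z = linkage(X, method="ward")

# Cut the tree so that at most 2 flat clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(Z.shape, sorted(set(labels.tolist())))
```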

3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise):

This algorithm defines clusters as areas of high density separated by areas of low density.
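
A minimal sketch, assuming scikit-learn and made-up data: two dense blobs plus one isolated point. The `eps` and `min_samples` values are illustrative guesses for this toy data, not recommended defaults:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (assumed for illustration): two dense blobs and one far-away point.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.2, size=(50, 2)),
    rng.normal(5.0, 0.2, size=(50, 2)),
    [[2.5, 10.0]],  # isolated point in a low-density area
])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)
labels = db.labels_

# scikit-learn marks noise points with the label -1.
n_clusters = len(set(labels.tolist()) - {-1})
print(n_clusters, labels[-1])
```

Note that the number of clusters is not passed in; it falls out of the density settings.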

4. Mean Shift Clustering:

It is a centroid-based algorithm, which updates candidates for centroids to be the mean of points within a given region.
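
A minimal sketch, assuming scikit-learn and made-up data. The first part does one update step by hand with a flat window; the second part lets `MeanShift` run to convergence. The bandwidth of 2.0 is an illustrative choice for this toy data:

```python
import numpy as np
from sklearn.cluster import MeanShift

# Toy data (assumed for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(5.0, 0.3, size=(50, 2)),
])
bandwidth = 2.0

# One manual update: move a candidate centroid to the mean of the
# points that fall inside its window.
candidate = X[0]
in_window = np.linalg.norm(X - candidate, axis=1) <= bandwidth
shifted = X[in_window].mean(axis=0)

# Full algorithm: repeat the update from many seeds until they settle,
# then merge near-duplicate centroids.
ms = MeanShift(bandwidth=bandwidth).fit(X)
print(shifted, ms.cluster_centers_)
```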

5. Gaussian Mixture Models (GMM):

This method uses a probabilistic model to represent subpopulations within an overall population. Each data point gets a probability of belonging to each cluster rather than a single hard assignment.
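
A minimal sketch of the soft assignment, assuming scikit-learn and made-up data (the two-component choice and the seed are illustrative):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data (assumed for illustration): two 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(3.0, 0.5, size=(50, 2)),
])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Soft assignment: one row per point, one column per component.
# Each row is a probability distribution over the components.
probs = gmm.predict_proba(X)

# A hard label is optional: pick the most probable component.
hard = probs.argmax(axis=1)
print(probs[:3].round(3), hard[:3])
```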

6. Spectral Clustering:

It uses the eigenvectors of a matrix derived from pairwise similarities (typically a graph Laplacian) to embed the data in fewer dimensions before applying a clustering algorithm, typically K-means.
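
A minimal sketch, assuming scikit-learn and its two-moons toy dataset. The nearest-neighbors affinity and `n_neighbors=10` are illustrative choices. scikit-learn may warn that the neighbor graph is not fully connected on data like this:

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Toy data (assumed for illustration): two interleaved half-circles,
# a shape that plain K-means on the raw coordinates tends to split badly.
X, y_true = make_moons(n_samples=200, noise=0.05, random_state=0)

sc = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",  # similarity graph from k nearest neighbors
    n_neighbors=10,
    assign_labels="kmeans",        # K-means runs on the spectral embedding
    random_state=0,
)
labels = sc.fit_predict(X)
print(labels[:10])
```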

7. OPTICS (Ordering Points To Identify the Clustering Structure):

It is similar to DBSCAN, but instead of fixing a single density threshold it orders the points and produces a reachability plot, from which clusters of varying density can be read off.
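
A minimal sketch of getting the reachability values, assuming scikit-learn and made-up data with one tight and one loose blob (`min_samples=5` is an illustrative choice). Plotting is left out; the array `reachability` is what a reachability plot would draw as bars:

```python
import numpy as np
from sklearn.cluster import OPTICS

# Toy data (assumed for illustration): blobs with different densities.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.2, size=(50, 2)),   # tight blob
    rng.normal(5.0, 0.8, size=(50, 2)),   # looser blob
])

optics = OPTICS(min_samples=5).fit(X)

# Reachability distances in the order the points were processed.
# Valleys in this sequence correspond to clusters.
reachability = optics.reachability_[optics.ordering_]
labels = optics.labels_
print(reachability[:5])
```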

8. Affinity Propagation:

It sends messages between pairs of samples until a set of exemplars and corresponding clusters gradually emerges.
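
A minimal sketch, assuming scikit-learn and made-up data. The `damping=0.9` value is an illustrative choice to help convergence on this toy data. The point to notice is that exemplars are actual rows of the dataset, not averaged centroids:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy data (assumed for illustration): three tight 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
    rng.normal(10.0, 0.3, size=(20, 2)),
])

ap = AffinityPropagation(damping=0.9, random_state=0).fit(X)

# Indices of the samples that ended up as exemplars.
exemplar_idx = ap.cluster_centers_indices_
exemplars = X[exemplar_idx]
print(len(exemplar_idx), ap.labels_[:5])
```

The number of clusters is not set directly; it depends on the `preference` parameter, which is left at its default here.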

9. BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies):

Designed for large datasets, it incrementally and dynamically clusters incoming multi-dimensional metric data points.
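
A minimal sketch of the incremental use, assuming scikit-learn and made-up data that is fed in four batches as a stand-in for a stream. The `threshold` and batch count are illustrative choices:

```python
import numpy as np
from sklearn.cluster import Birch

# Toy data (assumed for illustration): two blobs, shuffled so that
# every batch contains points from both.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 2)),
    rng.normal(5.0, 0.3, size=(100, 2)),
])
X = X[rng.permutation(len(X))]

birch = Birch(threshold=0.5, n_clusters=2)

# Feed the data in chunks; the internal tree of subcluster summaries
# is updated after each chunk instead of being rebuilt from scratch.
for batch in np.array_split(X, 4):
    birch.partial_fit(batch)

labels = birch.predict(X)
print(sorted(set(labels.tolist())))
```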

10. CURE (Clustering Using Representatives):

It represents each cluster by a fixed number of well-scattered representative points, shrunk toward the cluster centroid, rather than by the centroid alone.
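
scikit-learn does not ship CURE, so here is a rough NumPy sketch of just the representative-point step for a single cluster. The function name `cure_representatives`, its parameters, and the greedy farthest-point selection are my own illustrative choices; the hierarchical merging that full CURE does on top of this is not shown:

```python
import numpy as np

def cure_representatives(points, n_rep=4, shrink=0.5):
    """Pick n_rep well-scattered points and shrink them toward the centroid.

    Hypothetical helper for illustration; not a full CURE implementation.
    """
    centroid = points.mean(axis=0)
    # Start from the point farthest from the centroid.
    first = int(np.argmax(np.linalg.norm(points - centroid, axis=1)))
    chosen = [first]
    # Distance from every point to its nearest chosen representative.
    min_dist = np.linalg.norm(points - points[first], axis=1)
    while len(chosen) < min(n_rep, len(points)):
        # Greedily add the point farthest from all chosen representatives.
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(points - points[nxt], axis=1)
        )
    scattered = points[chosen]
    # Move each representative a fraction `shrink` of the way to the centroid.
    return scattered + shrink * (centroid - scattered)

# Toy cluster (assumed for illustration): four square corners plus the center.
cluster = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 1.0]])
reps = cure_representatives(cluster, n_rep=4, shrink=0.5)
print(reps)
```

Keeping several shrunken points per cluster lets the method follow non-spherical shapes, while the shrinking dampens the pull of outliers.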

EVERY DATA SCIENTIST NEEDS TO LEARN AI IN 2025.

99% of data scientists are overlooking AI.

I want to help.

On Wednesday, May 21st, I'm sharing one of my best AI Projects: Customer Segmentation Agent with AI

Register here (500 seats): learn.business-science.io/ai-register
