Mean-shift is a:
- Non-parametric clustering algorithm → no need to know number of clusters k in advance.
- Hierarchical method.
- Also known as:
- Mode seeking algorithm
- Sliding window algorithm
Widely used in:
- Image processing
- Computer vision
How it works:
- The algorithm uses a window (kernel) that slides over data points.
- The window radius is called the bandwidth.
- A common choice of kernel is the Gaussian window.
- Purpose: Find dense regions (modes) in the dataset.
Step-by-step (Algorithm ):
Step 1: Choose a window (kernel) with a given bandwidth.
Step 2: Place the window on a data point.
Step 3: Compute the mean of all data points inside the window.
Step 4: Move the window center to this mean.
- This shift is called the mean shift vector.
- Formula:

where:
- K: number of points inside the window.
- xi: data points within the radius.
Step 5: Repeat steps 3–4 until:
- Convergence (window stops moving).
- No more new points can be included.
Result: windows converge to modes → each mode becomes a cluster.
Advantages:
- No assumptions about data distribution.
- Works for non-convex cluster shapes.
- Only needs one parameter: bandwidth.
- Robust to noise.
- Avoids local minima or premature convergence.
Disadvantages:
- Choosing the right bandwidth is difficult:
- Too large → merges different clusters.
- Too small → detects too many clusters & may not converge properly.
- Number of clusters cannot be specified by the user.
Why it’s called mean-shift:
- Each window shifts toward the region with the highest density (the “mode”)
- The cluster centers are the final positions where the windows converge.