Is there any easy way to return the furthermost outlier after sklearn kmeans clustering?
Essentially I want to make a list of the biggest outliers for a load of clusters. Unfortunately I need to use sklearn.cluster.KMeans due to the assignment.
K-means is not well suited for "outlier" detection.
k-means tends to put an outlier into a one-element cluster of its own. Such a point then sits exactly on its centroid, so its distance is zero (the smallest possible) and it will not be detected.
K-means is not robust enough when there are outliers in your data. You may actually want to remove outliers prior to using k-means.
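That said, if the assignment forces you to use sklearn.cluster.KMeans, a rough way to get "the point furthest from its own centroid" for each cluster is to use the distances that `KMeans.transform` returns. This is only a sketch: the helper name `farthest_point_per_cluster` and the toy data are made up for illustration, and the caveats above still apply.

```python
import numpy as np
from sklearn.cluster import KMeans


def farthest_point_per_cluster(X, kmeans):
    """Map each cluster label to (row index, distance) of its farthest member.

    Hypothetical helper, not part of scikit-learn.
    """
    # transform() returns the distance from every sample to every centroid
    dists = kmeans.transform(X)
    labels = kmeans.predict(X)
    # keep only the distance of each sample to its *own* centroid
    own = dists[np.arange(len(X)), labels]
    result = {}
    for k in range(kmeans.n_clusters):
        members = np.flatnonzero(labels == k)
        if members.size == 0:
            continue  # empty cluster, nothing to report
        best = members[np.argmax(own[members])]
        result[k] = (int(best), float(own[best]))
    return result


# Toy data (assumed for illustration): two tight blobs plus one stray point
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(10, 10), scale=0.5, size=(50, 2)),
    [[3.0, 3.0]],  # the stray point, row index 100
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
outliers = farthest_point_per_cluster(X, kmeans)
for label, (idx, dist) in outliers.items():
    print(label, idx, round(dist, 3))
```

Sorting `outliers` by distance gives you a list of the "biggest" candidates. Note that a one-element cluster will report a distance of 0, which is exactly the failure mode described above.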
Use something like kNN distance, LOF (Local Outlier Factor) or LoOP (Local Outlier Probabilities) instead.
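For example, LOF is available in scikit-learn as `sklearn.neighbors.LocalOutlierFactor`. A minimal sketch, where the toy data and `n_neighbors=20` are assumptions you would tune for your own data:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Toy data (assumed for illustration): one blob plus one far-away point
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(size=(100, 2)),
    [[8.0, 8.0]],  # the far-away point, row index 100
])

lof = LocalOutlierFactor(n_neighbors=20)
pred = lof.fit_predict(X)  # -1 marks outliers, 1 marks inliers

# negative_outlier_factor_ is the negated LOF score, so flip the sign:
# larger values now mean "more outlying"
scores = -lof.negative_outlier_factor_

# indices of the five most outlying points, most extreme first
top = np.argsort(scores)[::-1][:5]
print(top)
```

Unlike the distance-to-centroid approach, this scores every point against its local neighbourhood, so it does not depend on how k-means happened to split the data.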