I'm clustering some data using scikit-learn.
I have the easiest possible task: I know the number of clusters, and I know the size of each cluster. Is it possible to specify this information and pass it to the K-means function?
No. You need some type of constrained clustering algorithm to do this, and none are implemented in scikit-learn. (This is not "the easiest possible task"; I don't even know of a principled algorithm that does this, aside from heuristically moving samples from one cluster to another.)
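For illustration only, here is a rough sketch of what such a heuristic might look like. It is not a scikit-learn feature and not an established algorithm: the function name `kmeans_with_sizes` and its parameters are made up for this example. It starts from ordinary `KMeans` centroids, gives cluster `k` exactly `sizes[k]` "slots", and then alternates between a one-to-one assignment of samples to slots (via SciPy's `linear_sum_assignment`) and a centroid update.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans


def kmeans_with_sizes(X, sizes, n_iter=10, random_state=0):
    """Hypothetical heuristic: k-means-style loop with fixed cluster sizes."""
    X = np.asarray(X, dtype=float)
    sizes = np.asarray(sizes, dtype=int)
    if np.any(sizes < 1) or sizes.sum() != len(X):
        raise ValueError("sizes must be positive and sum to the number of samples")
    k = len(sizes)

    # Start from ordinary (unconstrained) k-means centroids.
    km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
    centers = km.cluster_centers_

    # slot_cluster[j] is the cluster that slot j belongs to;
    # cluster i owns exactly sizes[i] slots, so there are len(X) slots in total.
    slot_cluster = np.repeat(np.arange(k), sizes)

    labels = None
    for _ in range(n_iter):
        # Squared distance from every sample to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Expand to one column per slot (n x n) and match samples to slots one-to-one.
        rows, cols = linear_sum_assignment(d2[:, slot_cluster])
        new_labels = np.empty(len(X), dtype=int)
        new_labels[rows] = slot_cluster[cols]
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignment stopped changing
        labels = new_labels
        # Recompute each centroid from the samples currently assigned to it.
        centers = np.array([X[labels == i].mean(axis=0) for i in range(k)])
    return labels, centers
```

Calling `kmeans_with_sizes(X, [5, 10, 15])` on 30 samples returns labels with exactly those cluster sizes. Two caveats: which centroid ends up with which size depends on the arbitrary order in which `KMeans` returns its centroids, and the assignment step works on an n-by-n cost matrix, so this is only practical for small datasets.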