"Finding Top relevant document in kmeans cluster"
amir_askary_sha
Hi,
After running kmeans clustering, how can I find out which document is the most relevant (the top document) in a cluster?
Right now the documents in a cluster are sorted in ascending order by their ID. I would like them sorted by a weight score showing how relevant each document is to its cluster, or at least to see the most relevant doc in the cluster.
Answers
Hi,
How do you define relevance?
Best,
Martin
Dortmund, Germany
I don't know exactly; any kind of relevance would do. For example, let's say every cluster has some top words in it (the centroid that kmeans finds); then the document with the smallest cosine/Euclidean distance to those top words of the cluster is the most relevant doc in the cluster.
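To make that idea concrete, here is a rough sketch in plain Python of ranking the documents in each cluster by their cosine distance to the cluster centroid. It assumes the document vectors (e.g. TF-IDF weights), the cluster assignments, and the centroids have already been exported from the kmeans run; the function names and the toy data are hypothetical and only for illustration, not part of any particular tool.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def rank_documents_by_centroid(vectors, labels, centroids, distance=cosine_distance):
    """Map each cluster id to its (distance, doc_id) pairs, closest first.

    vectors:   dict doc_id -> list of floats (e.g. TF-IDF weights)
    labels:    dict doc_id -> cluster id assigned by kmeans
    centroids: dict cluster id -> list of floats
    """
    ranked = defaultdict(list)
    for doc_id, vec in vectors.items():
        cluster = labels[doc_id]
        ranked[cluster].append((distance(vec, centroids[cluster]), doc_id))
    # Sorting the pairs orders by distance first, then by doc id on ties.
    return {cluster: sorted(pairs) for cluster, pairs in ranked.items()}


# Hypothetical toy data: five documents over a three-word vocabulary.
vectors = {
    1: [1.0, 0.0, 0.0],
    2: [0.9, 0.1, 0.0],
    3: [0.5, 0.5, 0.0],
    4: [0.0, 0.0, 1.0],
    5: [0.0, 0.2, 0.8],
}
labels = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
centroids = {0: [0.8, 0.2, 0.0], 1: [0.0, 0.1, 0.9]}

ranking = rank_documents_by_centroid(vectors, labels, centroids)
for cluster, pairs in sorted(ranking.items()):
    top_distance, top_doc = pairs[0]
    print(f"cluster {cluster}: top doc {top_doc}, order {[d for _, d in pairs]}")
```

The first entry of each list is the "top document" of that cluster. To use Euclidean distance instead, pass a different function as the `distance` argument; the ordering can differ between the two measures when document lengths vary a lot.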