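I am trying to run the code below, but I get an error: `ValueError: Number of labels is 1. Valid values are 2 to n_samples - 1 (inclusive)`. In addition, I get the message: `Affinity propagation did not converge, this model will not have any cluster centers.` What is the solution to these two problems?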
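```python
import numpy as np
import pandas as pd
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score

# X: feature matrix (defined elsewhere)
preferences = range(-15000, -5000, 50)  # arbitrarily chosen range
no_of_clusters = []  # number of distinct labels per preference
af_sil_score = []  # silhouette scores
for p in preferences:
    AF = AffinityPropagation(preference=p, max_iter=200).fit(X)
    no_of_clusters.append(len(np.unique(AF.labels_)))
    af_sil_score.append(silhouette_score(X, AF.labels_))
af_results = pd.DataFrame([preferences, no_of_clusters, af_sil_score], index=['preference', 'clusters', 'sil_score']).T
af_results.sort_values(by='sil_score', ascending=False).head()  # display only the 5 best scores
```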
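For some of your preference values, the algorithm does not suit your dataset and fails to converge (which gives you the convergence message). When the algorithm fails, it does not return usable labels: every sample ends up with the same label, so applying silhouette_score (the second error) makes no sense either.
You can check this page or other posts to see how to set the preference value. Alternatively, you can simply try a range of values and return -1 whenever the number of clusters is below 2.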
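On choosing the preference: with the default `affinity="euclidean"`, scikit-learn uses the negative squared Euclidean distance as the similarity, and when `preference` is left unset it falls back to the median of those similarities. A minimal sketch of one way to derive a search range from the data rather than guessing it; the idea of scanning from the minimum similarity up to the median is a heuristic assumed here, not something the library prescribes, and the variable names are illustrative:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import euclidean_distances

# Toy data, same settings as the example in this answer
X, _ = make_blobs(n_samples=250, random_state=0, n_features=5, centers=3)

# Similarity matrix as used with affinity="euclidean":
# negative squared Euclidean distance between every pair of samples
S = -euclidean_distances(X, squared=True)

pref_median = np.median(S)  # what scikit-learn uses when preference=None
pref_min = S.min()          # most dissimilar pair of samples

# Assumed heuristic: scan preferences between the minimum and the median
candidate_preferences = np.linspace(pref_min, pref_median, num=10)
print(pref_min, pref_median)
```

Printing the two bounds shows the scale of the similarities, which helps to judge whether a hand-picked range such as `range(-15000, -5000, 50)` is anywhere near the values that occur in the data.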
Using an example:
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score
from sklearn.datasets import make_blobs
import pandas as pd
X, labels_true = make_blobs(n_samples=250, random_state=0, n_features=5, centers=3)
preferences = range(-1000, -100, 100)
af_sil_score = []  # silhouette scores
no_of_clusters = []  # number of cluster centers found per preference
for p in preferences:
    AF = AffinityPropagation(preference=p, max_iter=200).fit(X)
    no_of_clusters.append(len(AF.cluster_centers_))
    # silhouette_score needs at least 2 clusters; otherwise record -1
    if len(AF.cluster_centers_) > 1:
        af_sil_score.append(silhouette_score(X, AF.labels_))
    else:
        af_sil_score.append(-1)
The result looks like this; on the example dataset above you can see that it starts working at around -500:
pd.DataFrame({'preferences': preferences, 'no_of_clusters': no_of_clusters, 'sil': af_sil_score})

   preferences  no_of_clusters        sil
0        -1000               0  -1.000000
1         -900               0  -1.000000
2         -800               0  -1.000000
3         -700               0  -1.000000
4         -600               0  -1.000000
5         -500               3   0.716733
6         -400               3   0.716733
7         -300               3   0.716733
8         -200               3   0.716733
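
The guard used in the loop can also be written against the labels themselves, which mirrors the condition in the `ValueError` (between 2 and n_samples - 1 distinct labels). A minimal sketch, assuming a hypothetical helper named `safe_silhouette` and a fallback score of -1; neither is part of scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def safe_silhouette(X, labels, fallback=-1.0):
    """Return the silhouette score, or `fallback` when it is undefined."""
    n_labels = len(np.unique(labels))
    # silhouette_score requires 2 <= n_labels <= n_samples - 1
    if n_labels < 2 or n_labels > len(X) - 1:
        return fallback
    return float(silhouette_score(X, labels))


X, labels_true = make_blobs(n_samples=250, random_state=0, n_features=5, centers=3)

# A single shared label stands in for a fit that produced no usable clusters
print(safe_silhouette(X, np.full(len(X), -1)))  # prints -1.0
# A valid labelling is scored normally
print(safe_silhouette(X, labels_true))
```

Inside the loop, `af_sil_score.append(safe_silhouette(X, AF.labels_))` would then replace the `if`/`else` branch.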