Unable to plot K-Means clusters for one-dimensional data



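I am trying to apply the K-Means algorithm to a binary classification task, but I cannot produce a scatter plot of the two resulting clusters.

My dataset simply has the following form:

# size, class
312, 1
319, 1
227, 0

Minimal example: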

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans

X = {'size': [312,319,227,301,273,311,277,291,303,381], 'class': [1,1,0,1,0,1,0,0,1,1]}
X = pd.DataFrame(data=X)
X_train, X_test, y_train, y_test = train_test_split(X['size'], X['class'], test_size=0.4)
# KMeans expects a 2D array of shape (n_samples, n_features)
X_train = X_train.values.reshape(-1,1)
X_test  = X_test.values.reshape(-1,1)
kmeans = KMeans(init="k-means++", n_clusters=2, n_init=10, max_iter=300, random_state=42)
kmeans.fit(X_train)
preds = kmeans.predict(X_test)

How can I draw a scatter plot of "X_test" that shows the two clusters, with each point colored according to its predicted label (0 or 1) in "preds"?

Since you have only one feature, all of your data lies on a single line. You can create a scatter plot like this:

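import matplotlib.pyplot as plt

color = ["blue", "red"]
# Put every point at y = 0 and color it by its predicted cluster
plt.scatter(X_test.flatten(), [0]*len(X_test), c=[color[p] for p in preds])
plt.show()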
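
For reference, here is a minimal end-to-end sketch of the one-feature case that also marks the two cluster centers on the same line. The `random_state=0` passed to `train_test_split`, the figure size, and the names `df`, `colors`, `fig` and `ax` are assumptions added for illustration; they are not part of the original question.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "size": [312, 319, 227, 301, 273, 311, 277, 291, 303, 381],
    "class": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
})

# random_state=0 is an assumption, added only to make the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    df[["size"]].to_numpy(), df["class"].to_numpy(), test_size=0.4, random_state=0
)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_train)
preds = kmeans.predict(X_test)

colors = ["blue", "red"]
fig, ax = plt.subplots(figsize=(6, 2))
# Test points on the line y = 0, colored by predicted cluster
ax.scatter(X_test[:, 0], [0] * len(X_test), c=[colors[p] for p in preds])
# Cluster centers drawn as black crosses on the same line
ax.scatter(kmeans.cluster_centers_[:, 0], [0, 0], c="black", marker="x", s=80)
ax.set_yticks([])  # the y axis carries no information here
ax.set_xlabel("size")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure. Keep in mind that the cluster indices 0 and 1 are assigned arbitrarily by K-Means and need not match the values in the `class` column.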

If you want to use two features instead, you can modify your data:

X = {
    'size_1': [312,319,227,301,273,311,277,291,303,381],
    'size_2': [152,165,301,145,310,145,315,156,160,165],
    'class': [1,1,0,1,0,1,0,0,1,1],
}
X = pd.DataFrame(data=X)
X_train, X_test, y_train, y_test = train_test_split(X[['size_1', 'size_2']], X['class'], test_size=0.4)
# No reshape is needed with two columns; refit and recompute the predictions
kmeans.fit(X_train)
preds = kmeans.predict(X_test)

Then modify the scatter plot:

plt.scatter(X_test.iloc[:,0],X_test.iloc[:,1], c=[color[p] for p in preds])
plt.show()
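
Putting the two-feature variant together, a self-contained sketch might look like the following. The model has to be fitted on the two-column training data before predicting, otherwise `preds` would still refer to the one-feature split. The `random_state=0` for the split and the names `df2`, `colors2`, `fig2` and `ax2` are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split

df2 = pd.DataFrame({
    "size_1": [312, 319, 227, 301, 273, 311, 277, 291, 303, 381],
    "size_2": [152, 165, 301, 145, 310, 145, 315, 156, 160, 165],
    "class": [1, 1, 0, 1, 0, 1, 0, 0, 1, 1],
})

# random_state=0 is an assumption, added only to make the split reproducible.
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    df2[["size_1", "size_2"]], df2["class"], test_size=0.4, random_state=0
)

# Fit on both features, then predict cluster labels for the test rows
kmeans2 = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_train2)
preds2 = kmeans2.predict(X_test2)

colors2 = ["blue", "red"]
fig2, ax2 = plt.subplots()
ax2.scatter(X_test2.iloc[:, 0], X_test2.iloc[:, 1],
            c=[colors2[p] for p in preds2])
# Cluster centers drawn as black crosses
centers = kmeans2.cluster_centers_
ax2.scatter(centers[:, 0], centers[:, 1], c="black", marker="x", s=80)
ax2.set_xlabel("size_1")
ax2.set_ylabel("size_2")
```

As before, `plt.show()` displays the figure.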
