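I'm running this code to determine how many clusters I need for K-Prototypes clustering, and I get this error:
PlotnineError: "Could not evaluate the 'x' mapping: 'Cluster' (original error: name 'Cluster' is not defined)"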
# Choose optimal K using Elbow method
cost = []
for cluster in range(1, 10):
    try:
        kprototype = KPrototypes(n_jobs = -1, n_clusters = cluster, init = 'Huang', random_state = 0)
        kprototype.fit_predict(dfMatrix, categorical = catColumnsPos)
        cost.append(kprototype.cost_)
        print('Cluster initiation: {}'.format(cluster))
    except:
        break
# Converting the results into a dataframe and plotting them
a = {'Cluster':range(1, 6), 'Cost':cost}
df_cost = pd.DataFrame.from_dict(a, orient='index')
df_cost.transpose()
# Data viz
plotnine.options.figure_size = (8, 4.8)
(
    ggplot(data = df_cost)+
    geom_line(aes(x = 'Cluster',
                  y = 'Cost'))+
    geom_point(aes(x = 'Cluster',
                   y = 'Cost'))+
    geom_label(aes(x = 'Cluster',
                   y = 'Cost',
                   label = 'Cluster'),
               size = 10,
               nudge_y = 1000) +
    labs(title = 'Optimal number of cluster with Elbow Method')+
    xlab('Number of Clusters k')+
    ylab('Cost')+
    theme_minimal()
)
Does anyone know what is going wrong?
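There is an oversight in your data-transformation code.
This line
df_cost.transpose()
should be
df_cost = df_cost.transpose()
transpose() returns a new DataFrame rather than modifying df_cost in place. Without the assignment, 'Cluster' and 'Cost' remain row labels instead of columns, so plotnine cannot find a 'Cluster' column to map to x.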
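
To see why the assignment matters, here is a minimal sketch using only pandas. The cost values are made-up placeholders standing in for kprototype.cost_, and the names still_broken, df_fixed and df_direct are just for illustration. The last step is an optional alternative, not something from the original code: it builds the frame column-wise and derives the cluster range from len(cost), which also sidesteps the mismatch between range(1, 10) in the loop and range(1, 6) in the dict.

```python
import pandas as pd

# Hypothetical cost values standing in for the kprototype.cost_ results.
cost = [500.0, 320.0, 210.0, 180.0, 170.0]
a = {'Cluster': list(range(1, 6)), 'Cost': cost}

# orient='index' turns the dict keys into row labels, not column names.
df_cost = pd.DataFrame.from_dict(a, orient='index')

# transpose() returns a new DataFrame; without assignment df_cost is unchanged.
df_cost.transpose()
still_broken = 'Cluster' not in df_cost.columns

# Keeping the transposed result gives the 'Cluster' and 'Cost' columns.
df_fixed = df_cost.transpose()

# Alternative sketch: build the frame column-wise and derive the cluster
# range from len(cost), so both columns always have the same length.
df_direct = pd.DataFrame({'Cluster': range(1, len(cost) + 1), 'Cost': cost})

print(still_broken)
print(list(df_fixed.columns))
print(list(df_direct.columns))
```

Either df_fixed or df_direct can then be passed to ggplot(data = ...) in place of the untransposed frame.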