R - Assigning new points with mixed data types to existing PAM-generated clusters



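>I am trying to assign new data to existing clusters, and my data contains both numeric and categorical variables. The example below is similar to my workflow. The data frame "newdf" holds the points I want to assign to the PAM clusters. How would I code this in R? Any help is appreciated, thanks.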

set.seed(1680)
library(dplyr)
library(ISLR)
library(cluster)
college_clean <- College %>%
  mutate(name = row.names(.),
         accept_rate = Accept/Apps,
         isElite = cut(Top10perc,
                       breaks = c(0, 50, 100),
                       labels = c("Not Elite", "Elite"),
                       include.lowest = TRUE)) %>%
  mutate(isElite = factor(isElite)) %>%
  select(name, accept_rate, Outstate, Enroll,
         Grad.Rate, Private, isElite)

gower_dist <- daisy(college_clean[,-1],
                    metric = "gower",
                    type = list(logratio = 3))
pam_fit <- pam(gower_dist, diss = TRUE, k = 3)

newdf <- data.frame(name = c("x_university", "y_university", "z_university"),
                    accept_rate = c(.73, .50, .98), Outstate = c(10000, 15000, 5000),
                    Enroll = c(500, 1000, 200), Grad.Rate = c(80, 65, 73),
                    Private = c("Yes", "No", "No"), isElite = c("Elite", "Not Elite", "Elite"))

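This is really quite simple, so just do it directly.

Assign each new point to the cluster whose medoid is nearest, i.e. take the argmin of its distances to the cluster medoids.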
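The nearest-medoid idea can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: `assign_to_medoids` is a hypothetical helper name, and the toy data frame stands in for the college data so the block runs with only the `cluster` package. It assumes the new rows have the same columns as the training data, and it recomputes the Gower dissimilarity on the combined data so that new points are scaled consistently with the old ones. One caveat: if a new point lies outside the original numeric range, the Gower scaling shifts slightly, so distances are not exactly those used when fitting.

```r
library(cluster)

# Hypothetical helper (not part of the cluster package): assign each row of
# `new` to the cluster of an existing PAM fit whose medoid is nearest.
# `medoid_idx` holds the row indices of the medoids in `train` (e.g. fit$id.med).
# Extra arguments in `...` are passed on to daisy(), e.g. `type`.
assign_to_medoids <- function(train, new, medoid_idx, ...) {
  # Coerce columns that are factors in `train` to factors with the same levels,
  # so character columns in `new` are treated as the same categories.
  for (col in names(train)) {
    if (is.factor(train[[col]])) {
      new[[col]] <- factor(new[[col]], levels = levels(train[[col]]))
    }
  }
  # Compute Gower dissimilarities on old and new rows together.
  combined <- rbind(train, new[, names(train), drop = FALSE])
  d <- as.matrix(daisy(combined, metric = "gower", ...))
  # Keep only the block "new rows x medoid rows".
  n <- nrow(train)
  d_new <- d[n + seq_len(nrow(new)), medoid_idx, drop = FALSE]
  # argmin over medoids: position k corresponds to cluster k of the PAM fit.
  unname(apply(d_new, 1, which.min))
}

# Toy mixed-type data with two well-separated groups (illustrative only).
set.seed(1)
train <- data.frame(
  x    = c(rnorm(20, 0), rnorm(20, 5)),
  size = c(runif(20, 100, 200), runif(20, 800, 1000)),
  kind = factor(rep(c("a", "b"), each = 20))
)
fit <- pam(daisy(train, metric = "gower"), diss = TRUE, k = 2)

new_points <- data.frame(x = c(0.2, 4.8), size = c(150, 900),
                         kind = c("a", "b"))
new_clusters <- assign_to_medoids(train, new_points, fit$id.med)
print(new_clusters)
```

With the objects from the question, the equivalent call would presumably be `assign_to_medoids(college_clean[, -1], newdf[, -1], pam_fit$id.med, type = list(logratio = 3))`, which drops the `name` column on both sides and reuses the same `type` setting as the original `daisy()` call.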
