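Can you help me better understand the code below? It contains some information about properties and uses the hclust function, but I don't understand the output p = 12. What does it represent? What is the maximum number of clusters generated for this data? Can you help me understand?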
library(geosphere)
Points_properties<-structure(list(Propertie=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29), Latitude = c(-24.781624, -24.775017, -24.769196,
-24.761741, -24.752019, -24.748008, -24.737312, -24.744718, -24.751996,
-24.724589, -24.8004, -24.796899, -24.795041, -24.780501, -24.763376,
-24.801715, -24.728005, -24.737845, -24.743485, -24.742601, -24.766422,
-24.767525, -24.775631, -24.792703, -24.790994, -24.787275, -24.795902,
-24.785587, -24.787558), Longitude = c(-49.937369,
-49.950576, -49.927608, -49.92762, -49.920608, -49.927707, -49.922095,
-49.915438, -49.910843, -49.899478, -49.901775, -49.89364, -49.925657,
-49.893193, -49.94081, -49.911967, -49.893358, -49.903904, -49.906435,
-49.927951, -49.939603, -49.941541, -49.94455, -49.929797, -49.92141,
-49.915141, -49.91042, -49.904772, -49.894034)), row.names = c(NA, -29L), class = c("tbl_df", "tbl",
"data.frame"))
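# Keep only the coordinate columns
coordinates<-subset(Points_properties,select=c("Latitude","Longitude"))
# Pairwise geographic distances; distm expects longitude first, hence 2:1
d<-distm(coordinates[,2:1])
d<-as.dist(d)
# Hierarchical clustering with average linkage
fit.average<-hclust(d,method="average")
# Start with one cluster and count the members of each cluster
p<-1
clusters<-cutree(fit.average, p)
nclusters<-matrix(table(clusters))
# Add clusters until some cluster has only one member
while (min(nclusters)>1) {
p<-p+1
clusters<-cutree(fit.average, p)
nclusters<-matrix(table(clusters))}
# Step back to the last cut without a single-member cluster
p<-p-1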
> p
[1] 12
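It looks like the loop searches for the smallest number of clusters that leaves at least one group with only a single member. Because the code then runs p <- p - 1, the reported p = 12 is one step before that: the largest number of clusters for which every cluster still has at least two members. It is not a maximum imposed by hclust; cutree can cut the tree into anywhere from 1 to 29 clusters, one per point.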
nclusters <- matrix(table(clusters))
nclusters stores the number of members in each cluster as a one-column matrix.
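To see what that line produces, here is a minimal sketch using a made-up cluster assignment. The vector below is hypothetical and only stands in for the output of cutree; it is not taken from the question's data.

```r
# Hypothetical cluster labels for six points (stand-in for cutree output)
clusters <- c(1, 1, 2, 3, 3, 3)

# table() counts how many points carry each label
counts <- table(clusters)

# matrix() drops the labels and keeps the counts as a one-column matrix
nclusters <- matrix(counts)

print(nclusters)       # one row per cluster, one column of member counts
print(min(nclusters))  # size of the smallest cluster
```

Here cluster 2 has a single member, so min(nclusters) is 1, which is exactly the value the while condition checks for.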
while (min(nclusters)>1) {
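The while loop stops as soon as the smallest value in nclusters is 1, that is, when at least one cluster contains a single member.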
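If it helps, the same loop can be traced on a tiny example. This is only a sketch: the seven points in x are hypothetical toy data (not the property coordinates), and plain dist() is used instead of distm so that it runs with base R alone.

```r
# Hypothetical toy data: seven points on a line forming three loose groups
x <- c(0, 1, 10, 11.5, 25, 27, 30)
fit <- hclust(dist(x), method = "average")

# Same loop as in the question, printing the state at each step
p <- 1
nclusters <- matrix(table(cutree(fit, p)))
while (min(nclusters) > 1) {
  cat("p =", p, " smallest cluster size =", min(nclusters), "\n")
  p <- p + 1
  nclusters <- matrix(table(cutree(fit, p)))
}
first_singleton <- p  # first cut that contains a single-member cluster
p <- p - 1            # last cut in which every cluster has >= 2 members

# Equivalent view without a loop: smallest cluster size for every possible cut
sizes_by_k <- sapply(seq_along(x), function(k) min(table(cutree(fit, k))))
print(sizes_by_k)
```

With this toy data the first single-member cluster appears when cutting into 4 clusters, so the loop ends with p equal to 3. The question's result of 12 can be read the same way: cutting into 13 clusters is the first cut that isolates a single point.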