t-SNE code: text labels for the clusters

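I am using the code below to run t-SNE. I want to run t-SNE on the whole data frame. Is there a way to label the points I am clustering, and to give them different colors so that they are visually distinguishable?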

These are my samples: "CMP_6792" "CMP_7256" "CMP_7653" "GMP_6792" "GMP_7256" "GMP_7653" "HSC_6792" "HSC_7256" "HSC_7653" "Mono_6792" "Mono_7256" "Mono_7653" "Gran1" "Gran2". I would like to label my points according to these samples.

This is my code:

file1<- read.csv('PRIMARY_CELL_EPILIST.csv')
head(file1)
names(file1)
class(file1)
dat <- data.frame(file1)
rownames(file1) <- make.names(file1[,1], unique = TRUE)
head(file1)
dim(file1)
data <- file1[,2:15]
head(data)
library(tsne)
tsne1 <- tsne(scale(data), perplexity = 10,max_iter = 300)
plot(tsne1[, 1], tsne1[, 2])
library(ggplot2)
plotdata <- data.frame(tsne_x = tsne1[, 1], tsne_y = tsne1[, 2])
plt1 <- ggplot(plotdata) + geom_point(aes(x = tsne_x, y = tsne_y))
plot(plt1)

Any help, suggestions, or improvements to my code would be highly appreciated.

You first need to cluster the t-SNE result. The cluster assignments are then used as the color assignments.

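# pam() requires the number of clusters k; 5 is an assumption here
# (one cluster per cell type: CMP, GMP, HSC, Mono, Gran)
cl <- cluster::pam( tsne1, k = 5 )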
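Note that `pam()` needs the number of clusters `k` in addition to the data. The following is a minimal sketch of what the call returns, using a made-up 14 x 2 matrix named `coords` as a stand-in for `tsne1`. The three blob centers and `k = 3` are assumptions chosen only for this toy data, not values taken from the question.

```r
library(cluster)

set.seed(1)
# Hypothetical stand-in for tsne1: 14 points in 2-D around three centers
centers <- rbind(c(0, 0), c(10, 10), c(-10, 10))
group   <- rep(1:3, times = c(5, 5, 4))
coords  <- centers[group, ] + matrix(rnorm(28, sd = 0.5), ncol = 2)

# k is mandatory; 3 is used only because the toy data has three blobs
cl <- pam(coords, k = 3)

table(cl$clustering)   # number of points assigned to each cluster
cl$medoids             # coordinates of each cluster's representative point
```

`cl$clustering` is an integer vector with one entry per row of the input, which is why it can be dropped straight into a data.frame next to the coordinates.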

Modify the plotdata data.frame so that it contains everything (sample names, t-SNE coordinates, cluster assignments):

# factor() makes ggplot use discrete colors instead of a continuous gradient
plotdata <- data.frame( tsne_x = tsne1[,1], tsne_y = tsne1[,2], SampleID = v,
                        Cluster = factor( cl$clustering ) )

where v is the vector of sample names you provided (i.e., v <- c( "CMP_6792", "CMP_7256", "CMP_7653", ... ), or v <- rownames(tsne1) if available).
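If the goal is to color the points by the known cell type rather than by a computed cluster, one option is to derive a grouping from the sample names themselves. This is a sketch that assumes the naming pattern seen in the question (a prefix followed by `_<digits>` or just `<digits>`); `cell_type` is a name introduced here for illustration. It also assumes `tsne1` has one row per sample. If the samples are the columns of `data`, as the 14 names matching columns 2:15 suggest, the matrix would probably need to be transposed with `t()` before calling `tsne()`.

```r
v <- c("CMP_6792", "CMP_7256", "CMP_7653",
       "GMP_6792", "GMP_7256", "GMP_7653",
       "HSC_6792", "HSC_7256", "HSC_7653",
       "Mono_6792", "Mono_7256", "Mono_7653",
       "Gran1", "Gran2")

# Strip a trailing "_<digits>" or "<digits>" to keep only the cell-type prefix
cell_type <- factor(sub("_?[0-9]+$", "", v))

table(cell_type)   # number of samples per cell type
```

`cell_type` could then be added as another column of `plotdata` and mapped to `color` in place of `Cluster`.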

Finally, adjust your ggplot call to use the relevant columns of the data.frame:

plt1 <- ggplot( plotdata, aes( x = tsne_x, y = tsne_y, color = Cluster ) ) +
    geom_point() + ggrepel::geom_text_repel( aes( label = SampleID ) )
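For comparison, the same idea (color by group, label by sample name) can be sketched with base graphics only, which may be convenient if ggrepel is not installed. Everything in this block is made up for illustration: the six sample names, the random coordinates in `xy`, and the use of `rainbow()` for the palette.

```r
set.seed(1)
v   <- c("CMP_1", "CMP_2", "GMP_1", "GMP_2", "HSC_1", "HSC_2")  # hypothetical names
xy  <- matrix(rnorm(12), ncol = 2)                              # hypothetical coordinates
grp <- factor(sub("_?[0-9]+$", "", v))                          # group = name prefix

palette_grp <- rainbow(nlevels(grp))     # one color per group
cols <- palette_grp[as.integer(grp)]     # one color per point

plot(xy[, 1], xy[, 2], col = cols, pch = 19, xlab = "tsne_x", ylab = "tsne_y")
text(xy[, 1], xy[, 2], labels = v, pos = 3, cex = 0.8)   # label above each point
legend("topright", legend = levels(grp), col = palette_grp, pch = 19)
```

Unlike `geom_text_repel()`, `text()` does not move labels apart, so overlapping labels are possible when points are close together.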
