R - Looping ggplot over subsets



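I want to create subsets of my data based on the unique names in a column (Congener), loop ggplot over those subsets, and save each plot to my working directory under a specific name.

What this code does is build a plot and output the subset names; however, the data itself is not being subset by unique name. There are also some columns (threshold, CHMS, Weber) whose values I want to draw as horizontal lines. Not all of the unique names have a CHMS or Weber value, but hopefully that does not affect the code.

Why is the data not being subset? Also, can I loop this over all combinations of 4 different columns (for example, I have Congener, Age, Gender and Ethnicity columns)?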

# load ggplot2
library(ggplot2)
library(hrbrthemes)
library(tidyverse)
library(dplyr) # for data manipulation
# set wd to location where plots will be saved
setwd('')
# getting all the congeners that will be looped over
congener_list = unique(dt$Congener)
# Creating an empty list to save the plots created. Lists in R are very versatile.
# They can pretty much store any type of data in them
quantile_plots = list()
# looping over unique congener names
for (i in congener_list) {
  dt %>% filter(Congener == i) %>%

  quantile_plots[[i]] = ggplot(aes(x = Dataset, y = y, color = Type)) +
    labs(y = 'ng/g Lipid Weight') +
    ggtitle(congener_list[i]) +
    scale_y_continuous(trans = 'log10') +
    geom_jitter(
      data = dt,
      shape = 19,
      alpha = 0.5,
      size = 4
    )

  quantile_plots[[i]] +
    geom_hline(aes(yintercept = threshold), color = 'red', size = 1.2) +
    geom_hline(aes(yintercept = CHMS), color = 'orange') +
    geom_hline(aes(yintercept = Weber), color = 'pink', size = 1.2)

  print(quantile_plots[[i]])

  # save the plots to disk
  ggsave(quantile_plots[[i]], file = paste0('Quantile_plot_', i, '.png'),
         width = 44.45, height = 27.78, units = 'cm', dpi = 600)
}

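The ideal output is here. What currently happens is that the title and name are correct, but the jittered data is all of the data rather than the specific subset.

Adjust the ggplot call so that the whole piped object is assigned into the list. You can do this either with <- before the pipeline or with the reverse assignment -> after it. Also, remove the data argument from geom_jitter so that it inherits the subset data from the plot:

Assign after: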

dt %>% filter(Congener == i) %>%
  ggplot(aes(x = Dataset, y = y, color = Type)) +
  labs(y = 'ng/g Lipid Weight') +
  ggtitle(i) +   # i is already the congener name, so use it directly
  scale_y_continuous(trans = 'log10') +
  geom_jitter(
    shape = 19,
    alpha = 0.5,
    size = 4
  ) -> quantile_plots[[i]]

Assign before:

quantile_plots[[i]] <- dt %>% 
  filter(Congener == i) %>%
  ggplot(...) +
  ...
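
Putting both fixes together, the loop might look roughly like the sketch below. The dt data frame here is a made-up stand-in (its values and congener names are assumptions, not the asker's data), and only the threshold line is drawn. Note that the hline layer is assigned back to the plot; in the question, the result of quantile_plots[[i]] + geom_hline(...) is never stored, so the lines would not appear in the saved plot.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the asker's data
dt <- data.frame(
  Congener  = rep(c("PCB-153", "PCB-180"), each = 4),
  Dataset   = rep(c("D1", "D2"), times = 4),
  Type      = rep(c("a", "b"), each = 2, times = 2),
  y         = c(12, 30, 45, 80, 5, 9, 14, 22),
  threshold = rep(c(50, 10), each = 4)
)

quantile_plots <- list()

for (i in unique(dt$Congener)) {
  # Subset first, then hand only the subset to ggplot()
  sub_dt <- dt %>% filter(Congener == i)

  plt <- ggplot(sub_dt, aes(x = Dataset, y = y, color = Type)) +
    labs(y = "ng/g Lipid Weight") +
    ggtitle(i) +
    scale_y_continuous(trans = "log10") +
    geom_jitter(shape = 19, alpha = 0.5, size = 4)  # no data argument: inherits sub_dt

  # Reassign so the extra layer is kept
  plt <- plt + geom_hline(aes(yintercept = threshold), color = "red")

  quantile_plots[[i]] <- plt
}
```

A ggsave() call like the one in the question can be added at the end of the loop body to write each plot to disk.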

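However, consider using by() to slice the data frame by one or more factors and then run a generalized plotting function on each slice:

Build the plotting function

Add tryCatch to catch errors in problematic runs, print the message, and return NULL so the loop is not interrupted.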

build_plot <- function(sub_df) {
  tryCatch({
    # Title built from the first row of each grouping column
    group_title <- with(
      sub_df, 
      paste(Congener[1], Age[1], Gender[1], Ethnicity[1], sep = "_")
    )
    plt <- sub_df %>%
      ggplot(aes(x = Dataset, y = y, color = Type)) +
      labs(y = 'ng/g Lipid Weight') +
      ggtitle(group_title) +
      scale_y_continuous(trans = 'log10') +
      geom_jitter(
        shape = 19, alpha = 0.5, size = 4
      ) 

    # CONDITIONALLY ADD HLINES IF COLUMNS EXIST
    if ("threshold" %in% names(sub_df)) {
      plt <- plt + geom_hline(aes(yintercept = threshold), color = 'red', size = 1.2)
    }
    if ("CHMS" %in% names(sub_df)) {
      plt <- plt + geom_hline(aes(yintercept = CHMS), color = 'orange')
    }
    if ("Weber" %in% names(sub_df)) {
      plt <- plt + geom_hline(aes(yintercept = Weber), color = 'pink', size = 1.2)
    }

    # print plot in console
    print(plt)

    # save plot to disk
    ggsave(
      filename = paste0(group_title, ".png"), plot = plt,
      width = 44.45, height = 27.78, units = 'cm', dpi = 600
    )
    return(plt)
  }, error = function(e) { print(e); return(NULL) }
  )
}
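
Since the question mentions that some congeners have no CHMS or Weber value, a column may exist but be entirely NA within a subset. One possible variation is sketched below. The helper add_hline_if_present is hypothetical (not part of ggplot2), and it assumes the reference value is constant within a subset, so it uses the first non-missing value.

```r
library(ggplot2)

# Hypothetical helper: add a horizontal line only when the column exists
# and holds at least one non-missing value in this subset.
add_hline_if_present <- function(plt, sub_df, col, ...) {
  if (!col %in% names(sub_df)) return(plt)
  value <- sub_df[[col]][!is.na(sub_df[[col]])]
  if (length(value) == 0) return(plt)
  # Assumption: the value is the same for every row of the subset
  plt + geom_hline(yintercept = value[1], ...)
}

# Made-up subset: threshold is present, CHMS is all NA, Weber is absent
sub_df <- data.frame(
  Dataset   = c("D1", "D2"),
  y         = c(10, 100),
  Type      = c("a", "b"),
  threshold = c(50, 50),
  CHMS      = c(NA_real_, NA_real_)
)

plt <- ggplot(sub_df, aes(x = Dataset, y = y, color = Type)) +
  geom_jitter()

plt <- add_hline_if_present(plt, sub_df, "threshold", color = "red")
plt <- add_hline_if_present(plt, sub_df, "CHMS", color = "orange")  # all NA: skipped
plt <- add_hline_if_present(plt, sub_df, "Weber", color = "pink")   # no such column: skipped

n_layers <- length(plt$layers)  # jitter layer plus the threshold line
```

Inside build_plot, the three if blocks could be replaced by three calls to a helper of this kind.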

Call by groups

Adjust group_title in the function to match the groups being run.

# ONE GROUP
quantile_plots <- by(dt, dt$Congener, build_plot)
# TWO GROUPS
quantile_plots <- by(dt, dt[c("Congener", "Age")], build_plot)
# THREE GROUPS
quantile_plots <- by(dt, dt[c("Congener", "Age", "Gender")], build_plot)
# FOUR GROUPS
quantile_plots <- by(dt, dt[c("Congener", "Age", "Gender", "Ethnicity")], build_plot)
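
To see how by() carves up a data frame on several columns, here is a small base R sketch on made-up data (the values are assumptions for illustration). It counts rows per combination instead of plotting, and also shows split() with drop = TRUE as one way to get a plain named list that omits combinations with no rows.

```r
# Hypothetical toy data: the combination (B, old) has no rows
dt <- data.frame(
  Congener = c("A", "A", "A", "B"),
  Age      = c("young", "young", "old", "young"),
  y        = c(1.2, 3.4, 5.6, 7.8)
)

# by() calls the function once for each non-empty combination
row_counts <- by(dt, dt[c("Congener", "Age")], function(sub_df) nrow(sub_df))
print(row_counts)

# split() with drop = TRUE returns a named list without empty combinations;
# names are the group values joined with "."
groups <- split(dt, dt[c("Congener", "Age")], drop = TRUE)
print(names(groups))
```

When the function returns plots, combinations with no data should show up as empty (NULL) entries in the by() result, so it may be worth filtering those out before using the list further.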
