对r中有多个因子的函数使用l-apply

在数据框df中，我构造了一个函数f，它计算x.sample和y.sample之间的相关性。然后，我试图运行999个随机化，计算per中每个随机化的预期相关性。我不确定这里的lapply是否写得正确，如果它实际上是在per函数中。如果我计算的是per，而不是4个sp上的任何其他函数，那么验证这一点的简单方法是什么?

set.seed(111)
library(truncnorm)
x <- rtruncnorm(n = 288,a = 0,b = 10,mean = 5,sd = 2)
v <- rtruncnorm(n = 288,a = 0,b = 10,mean = 5,sd = 2)
y <- ((v/x^2) - (1/x))
sp <- rep(c("A","B","C","D"), each = 72)
df <- data.frame(v,x,y,sp)
library(data.table)
setDT(df)
# function to estimate model coefficients
f <- function(x,v) {x.sample <- sample(x, length(x), replace=T)
y.sample <- (v/x.sample^2) - (1/x.sample)
per <- cor(y.sample, x.sample)}
set.seed(1234)
# 999 models for each species
result = rbindlist(
lapply(1:999, (i) df[,.(est = f(x,v)), sp][, i:=i])
)

我对data.table不是很熟悉，所以lapply()调用中的符号看起来很外国，但在玩弄它之后，我认为它正在做你想做的。

检查的一种方法是用不同的方式编码，并直观地比较结果:

library(dplyr)
out_list <- list()
for(i in 1:999){
df2 <- df %>%
group_by(sp) %>%
mutate(est = f(x, v)) %>%
select(sp, est) %>%
distinct()
out_list[[i]] <- df2
}
out_df <- bind_rows(out_list, .id = "id")
library(ggplot2)
p1 <- ggplot() + 
geom_histogram(data = out_df, mapping = aes(x = est, color = sp, fill = sp), show.legend = FALSE) + 
facet_wrap(~sp)
p2 <- ggplot() +
geom_histogram(data = result, mapping = aes(x = est, color = sp, fill = sp), show.legend = FALSE) +
facet_wrap(~sp)
gridExtra::grid.arrange(p1, p2, ncol = 2)

相关内容

最新更新

热门标签：