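I'm new to loops in R and need help writing several nested loops. I have a data frame in which each row holds the species counts for one site within a region. There are 50 regions, and the number of sites differs between regions. For each region, I need to calculate a diversity index while incrementally increasing the number of sites, and replicate each increment step 1000 times. For example: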
R1 <- subset(df, region=="1") #this needs to be completed for all 50 regions
R1$region<-NULL
max<-nrow(R1)-1
iter <- 1000 #the number of iterations
n <- 1 # the number of rows to be sampled. This needs to increase until "max"
outp <- rep(NA, iter)
for (i in 1:iter){
  d <- sample(1:nrow(R1), size = n, replace=FALSE)
  bootdata <- R1[d,]
  x <- colSums(bootdata) #this is not applicable until n>1
  outp[i] <- 1/diversity(x, index = "simpson")
}
Here is a sample dataset:
structure(list(region = c(1L, 1L, 1L, 2L, 2L, 3L, 4L, 4L), Sp1 = c(31L,
85L, 55L, 71L, 81L, 22L, 78L, 64L), Sp2 = c(10L, 84L, 32L, 86L,
47L, 93L, 55L, 35L), Sp3 = c(86L, 56L, 4L, 8L, 55L, 47L, 51L,
95L)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-8L), .Names = c("region", "Sp1", "Sp2", "Sp3"), spec = structure(list(
cols = structure(list(region = structure(list(), class =
c("collector_integer",
"collector")), Sp1 = structure(list(), class = c("collector_integer",
"collector")), Sp2 = structure(list(), class = c("collector_integer",
"collector")), Sp3 = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("region", "Sp1", "Sp2", "Sp3")),
default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
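In short, for each region I need to calculate the "simpson" index for a single site, randomly resampled 1000 times. Then I need to calculate the index again for 2 sites, after summing each column, 1000 times. Then for 3 sites, and so on up to max.
I'm also having trouble writing the output. I would like one data frame per region, with one column for each value of n up to max, each holding the 1000 iterations.
Thanks very much in advance.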
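You can write a function that processes one generic region. Then split the data into a list by region and use sapply to apply your custom function to each list element.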
library(vegan) # assumes diversity() comes from the vegan package

bootstrapByRegion <- function(R) {
  rgn <- unique(R$region)
  message(sprintf("Processing %s", rgn))
  R$region <- NULL
  nmax <- nrow(R) - 1
  if (nmax == 0) stop(sprintf("Trying to work on one row. No dice. Manually exclude region %s or handle otherwise.", rgn))
  iter <- 1000 # the number of iterations
  # pre-allocate the result: one row per iteration, one column per sample size
  output <- matrix(NA, nrow = iter, ncol = nmax)
  for (i in 1:nmax) {
    # i is the number of sites (rows) sampled in this step
    output[, i] <- replicate(iter, expr = {
      d <- sample(1:nrow(R), size = i, replace = FALSE)
      bootdata <- R[d, , drop = FALSE]
      x <- colSums(bootdata) # with drop = FALSE this also works for a single row
      outp <- 1/diversity(x, index = "simpson")
      outp
    })
  }
  output
}
xy <- split(df, f = df$region)
result <- sapply(xy, FUN = bootstrapByRegion) # list element is taken as R
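This fails for region 3 because it has only one row (nrow(R)-1 is 0, which triggers the stop() in the function). You can exclude such regions in a number of ways. Here is one.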
result <- sapply(xy[sapply(xy, nrow) > 1], FUN = bootstrapByRegion)
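The question also asks for one data frame per region, with one column per value of n. A minimal sketch of that last step follows. It assumes result is a list of matrices shaped like the output of bootstrapByRegion (rows are iterations, columns are sample sizes). The helper name asRegionFrame and the column names n1, n2, ... are made up for illustration, and small toy matrices stand in for the real result so the snippet runs on its own.

```r
# Toy stand-in for the list returned by the bootstrap step:
# one matrix per region, rows = iterations, columns = sample sizes.
result <- list(
  "1" = matrix(runif(10), nrow = 5, ncol = 2),
  "2" = matrix(runif(5), nrow = 5, ncol = 1)
)

# Hypothetical helper: convert one matrix into a data frame
# whose columns are named n1, n2, ... after the sample size.
asRegionFrame <- function(m) {
  out <- as.data.frame(m)
  names(out) <- paste0("n", seq_len(ncol(m)))
  out
}

# lapply always returns a list, one data frame per region
frames <- lapply(result, asRegionFrame)
str(frames)
```

If every region happens to have the same number of sites, sapply simplifies the results into a single matrix instead of a list. Calling lapply (or sapply with simplify = FALSE) in the bootstrap step avoids that and keeps one element per region.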