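The `future` package is great for running several tasks at once on your machine and making full use of your cores. I am trying to find a way to take the number of tasks, n_task, as a parameter and generate that many tasks automatically. I hope the reprex at the end of this post gives you an idea of what I am after. This should be a matter of writing some text as an expression to be evaluated by R, but so far I have been banging my head against the wall. Any suggestions are welcome. Thanks!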
library(tibble)
library(future)
split_vec <- function(d, mylen){
  res <- split(d, ceiling(seq_along(d)/mylen))
  return(res)
}
set.seed(1234)
nn <- 12000
df <- tibble(x=seq(0,12, length=nn), y=3*x+rnorm(nn))
## What I do below (fitting chunks of the data separately) may not make statistical sense; it is just an example to illustrate what I am after.
tt <- split_vec(seq(nn), 2000)
plan(multisession, workers=2)  ## multiprocess is deprecated in newer releases of future
fit1 %<-% {lm(y[tt[[1]]]~x[tt[[1]]], data=df)}
fit2 %<-% {lm(y[tt[[2]]]~x[tt[[2]]], data=df)}
## Is there a way to select e.g. the number of tasks n_task=4 and automatically calculate
## fit3 %<-% {lm(y[tt[[3]]]~x[tt[[3]]], data=df)}
## fit4 %<-% {lm(y[tt[[4]]]~x[tt[[4]]], data=df)}
## without writing this out explicitly?
Created on 2020-07-15 by the reprex package (v0.3.0)
You can use the furrr package:
library(furrr)
tt %>% future_map(~lm(y~x, data=df[.x,]))
$`1`

Call:
lm(formula = y ~ x, data = df[.x, ])

Coefficients:
(Intercept)            x
   -0.04129      3.03526

$`2`

Call:
lm(formula = y ~ x, data = df[.x, ])

Coefficients:
(Intercept)            x
     0.1534       2.9521

...
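
If you would rather stay with plain future and not pull in furrr, something along these lines might also work. This is only a sketch: n_task and chunk_size are names I made up for illustration, I use a base data.frame instead of a tibble to keep it self-contained, and I assume the multisession backend is available on your machine. The idea is to create one future per chunk in a list with future() and then collect the results with value(), so nothing has to be written out by hand.

```r
library(future)

plan(multisession, workers = 2)

set.seed(1234)
nn <- 12000
x <- seq(0, 12, length.out = nn)
df <- data.frame(x = x, y = 3 * x + rnorm(nn))

## Hypothetical parameters: how many tasks to launch and how many rows each gets
n_task <- 4
chunk_size <- 2000

## Same chunking as split_vec() in the question
chunks <- split(seq_len(nn), ceiling(seq_len(nn) / chunk_size))

## Create one future per chunk; only the first n_task chunks are used
futs <- lapply(chunks[seq_len(n_task)], function(idx) {
  future(lm(y ~ x, data = df[idx, ]))
})

## Block until each future has finished and collect the fitted models
fits <- lapply(futs, value)

## Go back to sequential processing and release the background workers
plan(sequential)
```

fits is then a named list with one lm object per chunk, so fits[[3]] plays the role of fit3 in the question. With 2 workers and 4 tasks, creating the third future will wait until a worker becomes free, which is fine here but worth knowing for long-running tasks.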