R: How to simplify the output of rerun (from purrr)?

(This is my first question on Stack Overflow; I hope I have asked it correctly.)

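I am using rerun (from purrr, part of the tidyverse) to repeat some calculations. Here is a very simple example (it may look absurd, but it illustrates the point):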

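library(tidyverse)
# do_calculation() is not defined in the original post; the output shown
# below is consistent with x1 * x2, so that is assumed here
do_calculation <- function(x1, x2) x1 * x2
# Function to do the calculation
do_rerun <- function(data_in){
  data_out <- data_in %>%
    group_by(id) %>%
    transmute(result = do_calculation(x1, x2)) %>%
    ungroup()
  return(data_out)
}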
# Some test data
(test_data <- tibble(id = c("1","2","3","1","2","3","1","2","3"),
                     day = c(1,1,1,2,2,2,3,3,3),
                     x1 = runif(9),
                     x2 = runif(9)) %>%
   arrange(id, day))
# A tibble: 9 x 4
id      day     x1     x2
<chr> <dbl>  <dbl>  <dbl>
1 1         1 0.195  0.0854
2 1         2 0.884  0.0863
3 1         3 0.539  0.240 
4 2         1 0.696  0.262 
5 2         2 0.752  0.663 
6 2         3 0.477  0.252 
7 3         1 0.0387 0.494 
8 3         2 0.286  0.589 
9 3         3 0.0249 0.870 
# Do the calculation .n = 3 times
# The output of rerun is a list,
# which in this case is a list of 3 unnamed tibbles
# each of which has an id and result column
(test <- rerun(.n = 3, do_rerun(test_data)))
# Output
[[1]]
# A tibble: 9 x 2
id    result
<chr>  <dbl>
1 1     0.0167
2 1     0.0763
3 1     0.129 
4 2     0.182 
5 2     0.499 
6 2     0.121 
7 3     0.0191
8 3     0.168 
9 3     0.0217
[[2]]
# A tibble: 9 x 2
id    result
<chr>  <dbl>
1 1     0.0167
2 1     0.0763
3 1     0.129 
4 2     0.182 
5 2     0.499 
6 2     0.121 
7 3     0.0191
8 3     0.168 
9 3     0.0217
[[3]]
# A tibble: 9 x 2
id    result
<chr>  <dbl>
1 1     0.0167
2 1     0.0763
3 1     0.129 
4 2     0.182 
5 2     0.499 
6 2     0.121 
7 3     0.0191
8 3     0.168 
9 3     0.0217

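I would like to convert this list of three tibbles into a single tibble containing id (from the first tibble), followed by result1, result2 and result3 (i.e. the result column from each of the three tibbles). I can access an individual column with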

id_tibble <- as_tibble(test[[1]][["id"]])

and

result_tibble <- as_tibble(test[[1]][["result"]])

What I would like to do (at least for the result column) is something like this:

new_tibble <- as_tibble(test[[1:3]][["result"]])

But it throws an error ("test[[1:3]][["result"]] : subscript out of bounds").

The final structure I would like to obtain is:

id    result1 result2 result3
<chr> <dbl>   <dbl>   <dbl>
1     0.0167  0.0167  0.0167
1     0.0763  0.0763  0.0763
1     0.129   0.129   0.129 
2     0.182   etc.

Perhaps the way to do this is with the map command from purrr (or one of its variants), but I cannot figure out how!
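A note on the error above: in base R, `[[` with an index vector of length greater than one indexes recursively, so `test[[1:3]]` is read as `test[[1]][[2]][[3]]` (first tibble, second column, third value) rather than "elements 1 to 3". Asking that single number for `[["result"]]` is what goes out of bounds. The sketch below illustrates this on a small stand-in list; `runs` and its values are made up for illustration and are not the data from the question.

```r
# Hypothetical stand-in for the list of runs (made-up values)
runs <- list(
  data.frame(id = c("a", "b", "c"), result = c(10, 20, 30)),
  data.frame(id = c("a", "b", "c"), result = c(40, 50, 60))
)

# [[ with a vector indexes recursively:
# runs[[c(1, 2, 3)]] is the same as runs[[1]][[2]][[3]]
nested <- runs[[c(1, 2, 3)]]  # first run, second column, third element

# To get the same column from every run, index each element separately
all_results <- lapply(runs, function(run) run[["result"]])
```

Here `nested` is a single number, which is why a further `[["result"]]` on it cannot succeed, while `all_results` is a list with one result vector per run.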

Here is a quick solution: bind everything into one large data frame and drop the duplicated id columns:

test %>%
  purrr::map_dfc(cbind) %>%
  dplyr::select(-matches("id.+"))

(Edit: added an alternative below that stays within dplyr; i.e. this one returns a tibble.)

test %>%
  dplyr::bind_cols() %>%
  dplyr::select(-matches("id.+"))

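I assume you are aware that all three results are identical in your example and that they will differ in your real problem. I also assume you will want to rerun the analysis more than 3 times. (Let me know if I am mistaken.)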
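The column-binding approach above depends on how duplicated column names are repaired, which may differ between dplyr versions, so the `"id.+"` pattern may not always keep exactly one id column. As a hedged alternative, the sketch below names the result columns explicitly and takes id from the first run only. It assumes every run returns its rows in the same order; the small `test` list here is a made-up stand-in for the output of rerun, not the data from the question.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical stand-in for the list returned by rerun() (made-up values)
test <- list(
  tibble(id = c("1", "1", "2"), result = c(0.1, 0.2, 0.3)),
  tibble(id = c("1", "1", "2"), result = c(0.4, 0.5, 0.6)),
  tibble(id = c("1", "1", "2"), result = c(0.7, 0.8, 0.9))
)

# Pull the result column out of every run and name the pieces
# result1, result2, ... (works for any number of runs)
results <- test %>%
  map("result") %>%
  set_names(paste0("result", seq_along(test)))

# Take id from the first run only, then attach the result columns
combined <- bind_cols(test[[1]]["id"], as_tibble(results))
```

Because the names are built with `seq_along(test)`, the same code should carry over unchanged when the analysis is rerun more than 3 times.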
