这是我想将数据附加到其中的文件
Collection A
Jan
Feb
March
April
Collection B
Jan
Feb
March
April
Revenue A
Jan
Feb
March
April
Revenue B
Jan
Feb
March
April
我要从中提取数据的文件如下所示:
Collection Month Collection A Collection B Revenue Month Revenue A Revenue B
Collection January 1 5 Revenue January 4 8
Collection February 2 6 Revenue February 3 7
Collection March 3 7 Revenue March 2 6
Collection April 4 8 Revenue April 1 5
我希望最终输出如下所示:
Collection A
Jan 1
Feb 2
March 3
April 4
Collection B
Jan 5
Feb 6
March 7
April 8
Revenue A
Jan 4
Feb 3
March 2
April 1
Revenue B
Jan 8
Feb 7
March 6
April 5
我能够使用偏移量和间接函数在 excel 上运行它。但是我想更好地自动化它以备将来使用,所以我正在 R 上尝试它。
我真的坚持如何结合两个数据集以找到所需的输出。这对我来说似乎是一项不可能完成的任务。我已经尝试了几个功能,如选择、子集和排列,它们都没有帮助我进步。
如果有人能帮助我解决这个问题,我会很高兴。
这是实现该输出的一种方法。请注意,我从示例数据的列名中删除了空格,以便更轻松地读取到 R 中。您没有指定希望输出数据帧的列名称是什么,因此它们几乎没有意义。
library(tidyverse)
tbl <- read_table2(
"Collection Month CollectionA CollectionB Revenue Month RevenueA RevenueB
Collection January 1 5 Revenue January 4 8
Collection February 2 6 Revenue February 3 7
Collection March 3 7 Revenue March 2 6
Collection April 4 8 Revenue April 1 5"
)
#> Warning: Duplicated column names deduplicated: 'Month' => 'Month_1' [6]
tbl %>%
select(-Collection, -Revenue, -Month_1) %>%
gather(variable, value, -Month) %>%
group_by(variable) %>%
group_modify(~ add_row(.x, Month = .y$variable, value = NA, .before = 1)) %>%
ungroup() %>%
select(-variable)
#> # A tibble: 20 x 2
#> Month value
#> <chr> <dbl>
#> 1 CollectionA NA
#> 2 January 1
#> 3 February 2
#> 4 March 3
#> 5 April 4
#> 6 CollectionB NA
#> 7 January 5
#> 8 February 6
#> 9 March 7
#> 10 April 8
#> 11 RevenueA NA
#> 12 January 4
#> 13 February 3
#> 14 March 2
#> 15 April 1
#> 16 RevenueB NA
#> 17 January 8
#> 18 February 7
#> 19 March 6
#> 20 April 5
创建于 2019-06-18 由 reprex 软件包 (v0.3.0(