如何在R中组合来自多个excel工作簿的多个工作表?



我每周都会收到相同的数据,这些数据在excel选项卡上具有相同的名称。我需要首先将所有具有相同名称的表分组,并在将它们组合在一起之前对它们进行操作。例如:

Week1.xlsx (workbook) 
- Blue (sheet) - has a column named type that just says "Color"
- Red (sheet) - has a column named type that just says "Color"
- Green (sheet)` - has a column named type that just says "Color"

Week2.xlsx (workbook) 
- Blue (sheet) - has a column named type that just says "Color"
- Red (sheet) - has a column named type that just says "Color"
- Green (sheet) - has a column named type that just says "Color"

等。

我需要将所有常见的表组合在一起,并修改type列,以说明它来自哪个颜色表。如果我将它们组合在一起,那么type列将只使用& Color&quot填充,因此工作表的名称是告诉我如何组织它的唯一信息。

最终输出:all-data.xlsx

Type
Color - Blue
Color - Red
Color - Blue
Color - Green
.
.
.

我不一定需要第二部分(修改列)的帮助,我只是想知道如何将所有常见的工作表组合在一起(可能在一个列表中),我可以通过名称访问。

EDIT我不知道如何把它标记为副本,但我只是提交给审查。更多答案请参阅下面的帖子。

好吧,我从这篇文章中找到了一个答案,完美地总结了这个问题。具体来说,.id = 'sheet_name'部分通过将每一行分配给工作表的名称来实现我的目标。

library(tidyverse)
library(readxl)
dir_path <- "~/test_dir/"         # target directory path where the xlsx files are located. 
re_file <- "^test[0-9]\.xlsx"    # regex pattern to match the file name format, in this case 'test1.xlsx', 'test2.xlsx' etc, but could simply be 'xlsx'.
read_sheets <- function(dir_path, file){
xlsx_file <- paste0(dir_path, file)
xlsx_file %>%
excel_sheets() %>%
set_names() %>%
map_df(read_excel, path = xlsx_file, .id = 'sheet_name') %>% 
mutate(file_name = file) %>% 
select(file_name, sheet_name, everything())
}
df <- list.files(dir_path, re_file) %>% 
map_df(~ read_sheets(dir_path, .))

最新更新