Unexpected error when using read_csv with the col_select and id arguments in R



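I run into a strange error when I use the id argument of read_csv together with col_select, but only when I refer to columns by numeric position in a non-contiguous way (for example c(1:2, 4)).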

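Specifically, the error occurs when I import multiple csv files from file paths with readr::read_csv, supply the id argument, and pass col_select a set of column positions that is non-contiguous.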

However, it works if we remove the id argument, refer to the columns by position without any "gaps", or even refer to the columns by name.

Is there a way to work around this? Or is this a bug?

See the following:

library(readr)

# Pretend this file path has at least two csv files in it
files <- "insert_file_path"

# This does not work: note the "gap" in the reference (no column 3).
# The error suggests it can't subset columns that don't exist.
read_csv(files,
         id = "source",
         col_select = c(1:2, 4) # the 3rd column is not pulled in
)

# This works, now that the columns are referenced without any "gaps"
read_csv(files,
         id = "source",
         col_select = c(1:3, 4) # columns are pulled in contiguously
)

# However, if we remove the "id" argument, the code below works
# (even though there is a "gap")
read_csv(files,
         # id = "source",
         col_select = c(1:2, 4) # this works (pulls in different columns though)
)

I have filed the following issue on readr, and it has been labeled as a bug.

https://github.com/tidyverse/readr/issues/1395
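
Until the bug is fixed, one possible workaround is to avoid combining id with a non-contiguous positional col_select in a single call: read each file on its own (which, as noted above, works without id) and attach the source column by hand. The sketch below is only an illustration under assumptions: the temporary directory, the file names, the column names a to d, and the helper read_one are all hypothetical and not part of my real data. Selecting columns by name instead of by position is the other option, since that also worked in my tests.

```r
library(readr)

# Hypothetical sample data: two small csv files in a temporary directory
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
write_csv(data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8),
          file.path(demo_dir, "one.csv"))
write_csv(data.frame(a = 9:10, b = 11:12, c = 13:14, d = 15:16),
          file.path(demo_dir, "two.csv"))
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file with the non-contiguous selection
# (no `id` argument), then add the source path as a regular column.
read_one <- function(path) {
  out <- read_csv(path, col_select = c(1:2, 4), show_col_types = FALSE)
  out$source <- path
  out
}

# Read every file separately and stack the results row-wise
combined <- do.call(rbind, lapply(files, read_one))
```

Note that the source column ends up last here rather than first, so it may need to be moved if the column order matters.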
