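When I try to use the id argument of read_csv together with the col_select argument, I get a strange error, but only when I reference columns by numeric position in a non-contiguous way (for example, c(1:2, 4)).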
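Specifically, the error occurs when readr::read_csv imports multiple CSV files from file paths with the id argument set and col_select references columns by position in a non-contiguous way.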
However, if we remove the id argument, reference the columns by position without any "gaps", or even reference the columns by name, then it works.
Is there a way to work around this, or is it a bug?
See the following:
library(readr)

# pretend this file path has at least two CSV files in it
files <- "insert_file_path"

# this will not work; notice there is a "gap" in the reference (no column 3)
# the error suggests it can't subset columns that don't exist
read_csv(files,
         id = "source",
         col_select = c(1:2, 4)  # notice that I do not pull in the 3rd column
)

# this will work now that I reference columns without any "gaps"
read_csv(files,
         id = "source",
         col_select = c(1:3, 4)  # works when I pull in columns in a contiguous way
)

# however, if we remove the "id" argument, the code below will work (even though there is a "gap")
read_csv(files,
         # id = "source",
         col_select = c(1:2, 4)  # this works (pulls in different columns though)
)
I have filed the following issue with readr, and it has been labeled as a bug.
https://github.com/tidyverse/readr/issues/1395
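
Until that is fixed, one possible workaround is to read each file separately without id, so that the positions in col_select refer only to the file's own columns, and then add the source column by hand. The following is only a sketch under assumptions: the two temporary CSV files, their column names (a, b, c, d), and the helper name read_one are made up for illustration, and it assumes readr 2.0 or later plus dplyr are installed.

```r
library(readr)
library(dplyr)

# Hypothetical setup: write two small CSV files into a temporary directory
csv_dir <- tempfile("csvs")
dir.create(csv_dir)
write_csv(data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8),
          file.path(csv_dir, "one.csv"))
write_csv(data.frame(a = 9:10, b = 11:12, c = 13:14, d = 15:16),
          file.path(csv_dir, "two.csv"))
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file without `id`, keep columns 1, 2 and 4,
# then record the file path in a leading "source" column
read_one <- function(path) {
  df <- read_csv(path, col_select = c(1:2, 4), show_col_types = FALSE)
  mutate(df, source = path, .before = 1)
}

# Read every file and stack the results into one tibble
result <- bind_rows(lapply(files, read_one))
print(result)
```

Because selecting by name works together with id, as noted above, passing column names to col_select is another option when the names are known in advance.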