Passing vectors of column names to set specific column types in readr's read_csv in R



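I want to open some .csv files with the default column type set to "i" for integer. However, some files also have specific columns that I want readr::read_csv to open with a defined type (the logic for choosing those columns doesn't matter; assume I know which columns apply to which files).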

Is there a way to pass these columns to the col_types argument of read_csv while still having every other column opened as integer?

df <- data.frame(
  a = c(1, 2, 3, 4),
  b = sample(1:100, 4),
  c_text = c("hi", "I", "am", "text"),
  d_decimals = runif(4),
  e_more_text = c("another", "text", "column", "lol")
)
readr::write_csv(df, "/path/to/csv/file.csv")
character_cols <- c("c_text", "e_more_text")
double_cols <- "d_decimals"
data <- readr::read_csv(
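  "/path/to/csv/file.csv",
  # supply something here to determine column types
  # (desired usage only: as written, cols() treats character_cols and
  # double_cols as literal column names, not as the vectors defined above)
  col_types = readr::cols(.default = "i", character_cols = "c", double_cols = "d")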
)

Because of the logic that works out which columns should be character, double, and so on, I would ideally supply them as vectors of names.

Cheers

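You can write a helper function that combines your extra specification with the default column specification, then builds the full spec with do.call.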

extra_spec = list(
  "c_text" = "c",
  "d_decimals" = "d",  # double, matching double_cols in the question
  "e_more_text" = "c"
)
read_csv_with_default_int = function(path, extra_spec) {
  # Append the integer default to the per-column overrides, then call cols()
  readr::read_csv(
    path,
    col_types = do.call(readr::cols, c(extra_spec, list(.default = readr::col_integer())))
  )
}
read_csv_with_default_int("file.csv", extra_spec = extra_spec)

You can also use a helper like this to avoid a lot of nested logic:

# cols() with the default type pre-filled as integer
cols_default_int = purrr::partial(readr::cols, .default = readr::col_integer())
read_csv_with_default_int = function(path, col_types) {
readr::read_csv(path, col_types = do.call(cols_default_int, col_types))
}
read_csv_with_default_int("file.csv", col_types = extra_spec)
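
Since the question asks to supply the columns as vectors of names, the named list used above can be built from those vectors rather than typed out by hand. The following is only a minimal sketch: spec_from_names is a hypothetical helper name, the temporary file and its three columns are made up for illustration, and it assumes only character and double overrides are needed.

```r
library(readr)

# Hypothetical helper: turn vectors of column names into a named list of
# readr type abbreviations, e.g. list(c_text = "c", d_decimals = "d").
spec_from_names <- function(character_cols = character(), double_cols = character()) {
  c(
    setNames(rep(list("c"), length(character_cols)), character_cols),
    setNames(rep(list("d"), length(double_cols)), double_cols)
  )
}

# Write a small illustrative file to a temporary location.
path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(a = 1:3, c_text = c("hi", "I", "am"), d_decimals = c(0.1, 0.2, 0.3)),
  path
)

extra_spec <- spec_from_names(
  character_cols = "c_text",
  double_cols = "d_decimals"
)

# Every column not named in extra_spec falls back to integer.
data <- read_csv(
  path,
  col_types = do.call(cols, c(extra_spec, list(.default = col_integer())))
)
```

The resulting list has the same shape as extra_spec in the answer, so it can be passed to either version of read_csv_with_default_int unchanged.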
