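I use fs::dir_ls(path, regexp = paste0("LATO-", "[[:alpha:]]*", ".csv$"))
to get the list of files in a particular directory.
By default, the files are sorted alphabetically. Is there a way in R to have them sorted according to my own pattern:
LATO-sty.csv, LATO-lut.csv, LATO-mar.csv, LATO-kwi.csv, LATO-maj.csv, LATO-cze.csv, LATO-lip.csv,
LATO-sie.csv, LATO-wrz.csv, LATO-paz.csv, LATO-lis.csv, LATO-gru.csv
The pattern you describe is not a general one that can be applied to every possible value in the file list. However, we can make sure that if these specific values occur in your vector, they are sorted to the front: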
Example fs::dir_ls() data:
files <- c('some/dir/LATO-bar.csv', 'some/dir/LATO-baz.csv', 'some/dir/LATO-foo.csv',
'some/dir/LATO-kwi.csv', 'some/dir/LATO-lut.csv', 'some/dir/LATO-sty.csv',
'some/dir/ZLATO-bar.csv', 'some/dir/ZLATO-baz.csv')
Code:
order <- c('LATO-sty.csv', 'LATO-lut.csv', 'LATO-mar.csv', 'LATO-kwi.csv',
'LATO-maj.csv', 'LATO-cze.csv', 'LATO-lip.csv', 'LATO-sie.csv',
'LATO-wrz.csv', 'LATO-paz.csv', 'LATO-lis.csv', 'LATO-gru.csv')
# get `files` present in `order`
set1 <- files[fs::path_file(files) %in% order] # extract filenames
ids <- match(fs::path_file(set1), order) # get matching IDs from `order`
ids_sorted <- sort(ids, index.return = TRUE) # get sort order (`$ix` holds the permutation)
set1_sorted <- set1[ids_sorted$ix] # apply sort order
# get `files` NOT present in `order`, keep them in the same order
set2 <- files[!fs::path_file(files) %in% order]
# join sets
result <- unname(c(set1_sorted, set2))
Result:
> result
[1] "some/dir/LATO-sty.csv" "some/dir/LATO-lut.csv" "some/dir/LATO-kwi.csv" "some/dir/LATO-bar.csv" "some/dir/LATO-baz.csv"
[6] "some/dir/LATO-foo.csv" "some/dir/ZLATO-bar.csv" "some/dir/ZLATO-baz.csv"
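The same reordering can probably be written more compactly with match() and order(). This is only a sketch: it uses base R's basename() in place of fs::path_file() so that it runs without the fs package, and the name month_order is chosen here just to avoid shadowing base::order(). Files that are not in month_order get NA from match(), and order() puts NA last by default.

```r
files <- c('some/dir/LATO-bar.csv', 'some/dir/LATO-baz.csv', 'some/dir/LATO-foo.csv',
           'some/dir/LATO-kwi.csv', 'some/dir/LATO-lut.csv', 'some/dir/LATO-sty.csv',
           'some/dir/ZLATO-bar.csv', 'some/dir/ZLATO-baz.csv')

# Desired order of the known file names (hypothetical variable name).
month_order <- c('LATO-sty.csv', 'LATO-lut.csv', 'LATO-mar.csv', 'LATO-kwi.csv',
                 'LATO-maj.csv', 'LATO-cze.csv', 'LATO-lip.csv', 'LATO-sie.csv',
                 'LATO-wrz.csv', 'LATO-paz.csv', 'LATO-lis.csv', 'LATO-gru.csv')

# Position of each file name in month_order; NA for names that are not listed.
rank <- match(basename(files), month_order)

# Sort by rank; NA ranks go last, and the second key keeps the original
# order among files that share a rank (here: the unlisted ones).
result <- files[order(rank, seq_along(files))]
result
```

With the example data this should give the same vector as the longer version above.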
If you have the names sorted alphabetically,
x
# [1] "LATO-cze.csv" "LATO-gru.csv" "LATO-kwi.csv" "LATO-lip.csv" "LATO-lis.csv" "LATO-lut.csv"
# [7] "LATO-maj.csv" "LATO-mar.csv" "LATO-paz.csv" "LATO-sie.csv" "LATO-sty.csv" "LATO-wrz.csv"
you can easily sort them according to a minimal pattern
pattern <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paz",
"lis", "gru")
or, if you prefer to type out the whole pattern and think that's safer,
pattern <- c("LATO-sty.csv", "LATO-lut.csv", "LATO-mar.csv", ...)
using grep.
sapply(pattern, grep, x)
# sty lut mar kwi maj cze lip sie wrz paz lis gru
# 11 6 8 3 7 1 4 10 12 9 5 2
sapply(pattern, grep, x, value=TRUE) ## use `value=TRUE` to check if it's right
# sty lut mar kwi maj cze
# "LATO-sty.csv" "LATO-lut.csv" "LATO-mar.csv" "LATO-kwi.csv" "LATO-maj.csv" "LATO-cze.csv"
# lip sie wrz paz lis gru
# "LATO-lip.csv" "LATO-sie.csv" "LATO-wrz.csv" "LATO-paz.csv" "LATO-lis.csv" "LATO-gru.csv"
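One caveat with the grep approach: sapply() only returns a plain integer vector when every pattern matches exactly one element of x. If a short pattern matched zero or several names, the result would be a list instead. A possible exact-match variant, sketched here under the assumption that every name has the form LATO-<abbr>.csv, extracts the abbreviation first and then uses match():

```r
x <- c("LATO-cze.csv", "LATO-gru.csv", "LATO-kwi.csv", "LATO-lip.csv",
       "LATO-lis.csv", "LATO-lut.csv", "LATO-maj.csv", "LATO-mar.csv",
       "LATO-paz.csv", "LATO-sie.csv", "LATO-sty.csv", "LATO-wrz.csv")
pattern <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paz",
             "lis", "gru")

# Pull out the part between "LATO-" and ".csv" (assumed naming scheme).
abbr <- sub("^LATO-([[:alpha:]]+)\\.csv$", "\\1", x)

# Exact lookup: position in x of each abbreviation, in the desired order.
idx <- match(pattern, abbr)

# Fail early if an abbreviation has no corresponding file.
stopifnot(!anyNA(idx))

x[idx]
```

Here idx should be the same index vector that sapply(pattern, grep, x) produces, just without names.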
To finally sort the list lst, we simply subset it with the result of grep.
lst[sapply(pattern, grep, x)]
# $`LATO-sty.csv`
# list()
#
# $`LATO-lut.csv`
# list()
#
# $`LATO-mar.csv`
# list()
#
# $`LATO-kwi.csv`
# list()
#
# $`LATO-maj.csv`
# list()
#
# $`LATO-cze.csv`
# list()
#
# $`LATO-lip.csv`
# list()
#
# $`LATO-sie.csv`
# list()
#
# $`LATO-wrz.csv`
# list()
#
# $`LATO-paz.csv`
# list()
#
# $`LATO-lis.csv`
# list()
#
# $`LATO-gru.csv`
# list()
Data:
x <- c("LATO-cze.csv", "LATO-gru.csv", "LATO-kwi.csv", "LATO-lip.csv",
"LATO-lis.csv", "LATO-lut.csv", "LATO-maj.csv", "LATO-mar.csv",
"LATO-paz.csv", "LATO-sie.csv", "LATO-sty.csv", "LATO-wrz.csv"
)
lst <- setNames(replicate(12, list()), x)
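For completeness, here is a rough end-to-end sketch that lists files from disk and sorts them in one go. It makes several assumptions that are not in the question: it uses base R's list.files() instead of fs::dir_ls(), works in a scratch directory under tempdir(), and tightens the regular expression by anchoring it at the start and escaping the dot before csv. The names demo_dir and months are made up for the illustration.

```r
# Hypothetical scratch directory with a few empty files to list.
demo_dir <- file.path(tempdir(), "lato-demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  demo_dir,
  c("LATO-kwi.csv", "LATO-lut.csv", "LATO-sty.csv", "LATO-foo.csv", "ZLATO-bar.csv")
)))

# Desired order of the abbreviations.
months <- c("sty", "lut", "mar", "kwi", "maj", "cze", "lip", "sie", "wrz", "paz",
            "lis", "gru")

# Anchored pattern: "ZLATO-..." is excluded and "." is matched literally.
found <- list.files(demo_dir, pattern = "^LATO-[[:alpha:]]*\\.csv$", full.names = TRUE)

# Extract the abbreviation and sort by its position in `months`;
# unknown abbreviations get NA and end up last, alphabetically among themselves.
abbr <- sub("^LATO-([[:alpha:]]*)\\.csv$", "\\1", basename(found))
sorted <- found[order(match(abbr, months), basename(found))]
basename(sorted)
```

Sorting on basename() rather than the full path keeps the result independent of where the directory lives.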