How to split a string to get the required part in R



I created a list of files with list.files using the following code:

# Make a list of the files (backslashes must be doubled inside R string literals)
files <- list.files(path = "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB",
                    pattern = glob2rx("*.tif$*"), full.names = TRUE)
files
 [1] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.APR.2022_ET.tif"
 [2] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.MAR.2022_ET.tif"
 [3] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_04.DEC.2021_ET.tif"
 [4] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_06.FEB.2022_ET.tif"
 [5] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_10.MAR.2022_ET.tif"
 [6] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_11.APR.2022_ET.tif"
 [7] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_13.DEC.2021_ET.tif"
 [8] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_15.FEB.2022_ET.tif"
 [9] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_18.NOV.2021_ET.tif"
[10] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_19.MAR.2022_ET.tif"
[11] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_20.DEC.2021_ET.tif"
[12] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_22.FEB.2022_ET.tif"
[13] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_26.MAR.2022_ET.tif"
[14] "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_30.JAN.2022_ET.tif"

Now I want to extract only the TSEB_03.APR.2022_ET part (the 7th component of the path) from all 14 entries, and build a vector like this:

c("TSEB_03.APR.2022_ET", "TSEB_03.MAR.2022_ET", "TSEB_04.DEC.2021_ET", 
"TSEB_06.FEB.2022_ET", "TSEB_10.MAR.2022_ET", "TSEB_11.APR.2022_ET", 
"TSEB_13.DEC.2021_ET", "TSEB_15.FEB.2022_ET", "TSEB_18.NOV.2021_ET", 
"TSEB_19.MAR.2022_ET", "TSEB_20.DEC.2021_ET", "TSEB_22.FEB.2022_ET", 
"TSEB_26.MAR.2022_ET", "TSEB_30.JAN.2022_ET")

How can I do this?

You can use basename + gsub:

x <- c("E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.APR.2022_ET.tif",
       "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.MAR.2022_ET.tif")
# basename() drops the directory; gsub() then removes the trailing ".tif"
gsub("\\.tif$", "", basename(x))
#[1] "TSEB_03.APR.2022_ET" "TSEB_03.MAR.2022_ET"
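
As a variation on the same idea, the extension can be dropped without writing a regex by hand. The sketch below assumes the `tools` package that ships with R is available and reuses the two example paths from the question; `stems` is just an illustrative name.

```r
# Example paths copied from the question (backslashes doubled for R string literals).
x <- c("E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.APR.2022_ET.tif",
       "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.MAR.2022_ET.tif")

# basename() drops everything up to the last "/";
# file_path_sans_ext() then removes only the final extension (".tif").
stems <- tools::file_path_sans_ext(basename(x))
print(stems)
```

Only the last extension is removed, so the dots inside the date part (03.APR.2022) are left alone.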

Another solution, using str_extract from stringr:

library(stringr)
x <- c("E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.APR.2022_ET.tif",
       "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.MAR.2022_ET.tif")
# Match whatever sits between the "/" and the ".tif" extension
str_extract(x, "(?<=\\/)(.*)(?=\\.tif)")

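Regex explanation:

We match anything with the capture group (.*), subject to two conditions. The first is a lookbehind, (?<=...), which requires a / immediately before the match; I used a backslash to escape that character. The second is a lookahead, (?=...), which requires the .tif extension immediately after the captured group. Neither lookaround is included in the result, so only the file name without its extension is returned.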

Output:

[1] "TSEB_03.APR.2022_ET" "TSEB_03.MAR.2022_ET"
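
If stringr is not installed, a comparable result can be had in base R with a single capture group. This is a sketch rather than the answerer's own code: it assumes every path ends in .tif and contains at least one forward slash, and `stems` is an illustrative name.

```r
# Example paths copied from the question (backslashes doubled for R string literals).
x <- c("E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.APR.2022_ET.tif",
       "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/TSEB_03.MAR.2022_ET.tif")

# "^.*/" greedily consumes everything up to the LAST "/",
# "(.*)" captures the file name, and "\\.tif$" anchors the extension at the end.
# The whole string is replaced by the captured group.
stems <- sub("^.*/(.*)\\.tif$", "\\1", x)
print(stems)
```

Because the leading `.*` is greedy, this version still returns only the file name when a path contains several forward slashes.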

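To expand on my comment about avoiding regex: we already know which folder the files are in, and they are all in that same folder, so list the file names without the folder: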

myFolder <- "E:\\ICAR PDF\\Data\\Tridip\\Desktop\\TSEB/"
files <- list.files(path = myFolder, pattern = glob2rx("*.tif$*")) # default full.names = FALSE

Then read your files, for example:

dataList <- lapply(paste0(myFolder, files), myLoadRasterFunction)

If we need the file names, they are readily available in the files object, with no regex required.
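
The file names returned this way still carry the .tif extension. A minimal sketch of the whole workflow, under stated assumptions: a temporary folder with two empty files stands in for the real directory, `myLoadRasterFunction` is a hypothetical placeholder (here it only returns the file size), and `labels` is an illustrative name.

```r
# Hypothetical stand-in folder: a temporary directory with two empty .tif files.
myFolder <- file.path(tempdir(), "TSEB_demo")
dir.create(myFolder, showWarnings = FALSE)
invisible(file.create(file.path(myFolder,
  c("TSEB_03.APR.2022_ET.tif", "TSEB_03.MAR.2022_ET.tif"))))

# Bare file names, no folder prefix (full.names = FALSE is the default).
files <- list.files(path = myFolder, pattern = "\\.tif$")

# Strip the extension to get the labels asked for in the question.
labels <- tools::file_path_sans_ext(files)

# Placeholder loader (assumption): replace with the real raster-reading function.
myLoadRasterFunction <- function(path) file.size(path)

# Load every file and name each list element after its label.
dataList <- setNames(lapply(file.path(myFolder, files), myLoadRasterFunction), labels)
print(names(dataList))
```

Naming the list elements keeps each loaded object tied to the file it came from, so later steps can refer to entries by label rather than by position.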
