R: How to parameterize the str_extract pattern in a loop



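I have strings whose parts are separated by forward slashes. I am trying to generate the prefixes in a loop, so I need to parameterize the regular expression so that it can be used inside the loop. I have 7 levels.

I want to extract the following using a regular expression and stringi: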

A
A/268
A/268/200
A/268/200/300
A/268/200/300/400

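Here is what I have:

n <- 3
# Attempt: the n inside the quotes is just the letter n, not the variable
str_extract("A/268/200/300/400/500", "(.*?/){n}")

# Hard-coded version of the pattern that I want to parameterize
str_extract("A/268/200/300/400/500", "(.*?/){3}")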

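We can use glue::glue to interpolate the value. Because the regex quantifier already uses curly braces, which are glue's default delimiters, the delimiters are changed to "<-" and "->" with .open and .close.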

n <- 3
pat <- as.character(glue::glue("(.*?/){<-n-1->}([^/]+)", 
.open = "<-", .close = "->"))
pat
#[1] "(.*?/){2}([^/]+)"
library(stringr)
str_extract("A/268/200/300/400/500", pat)
#[1] "A/268/200"
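
As a possible alternative that does not need glue, the pattern can also be assembled with base R's sprintf(). The following is only a minimal sketch: the helper names make_pattern and extract_levels are hypothetical (they are not part of the original answer), and the pattern is a slightly stricter variant that anchors at the start and uses [^/]* instead of .*?.

```r
# Hypothetical helper: build a pattern that matches the first n levels.
# %d is replaced by n - 1, the number of "level/" groups before the last level.
make_pattern <- function(n) {
  sprintf("^(?:[^/]*/){%d}[^/]+", n - 1L)
}

# Hypothetical helper: return the first n slash-separated levels of x.
# Uses base R regexpr()/regmatches() with PCRE (perl = TRUE).
extract_levels <- function(x, n) {
  m <- regexpr(make_pattern(n), x, perl = TRUE)
  regmatches(x, m)
}

extract_levels("A/268/200/300/400/500", 3)
#[1] "A/268/200"
```

The same pattern string could presumably be passed to str_extract as well; sprintf() simply avoids the clash between glue's braces and the regex quantifier.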

If we need it in a loop:

v1 <- 1:7
lst1 <- vector('list', length(v1))
for(i in v1) {
tmppat <- as.character(glue::glue("(.*?/){<-i-1->}([^/]+)",
.open = "<-", .close = "->"))
lst1[[i]] <- str_extract("A/268/200/300/400/500", tmppat)
}


head(lst1, 5)
#[[1]]
#[1] "A"
#[[2]]
#[1] "A/268"
#[[3]]
#[1] "A/268/200"
#[[4]]
#[1] "A/268/200/300"
#[[5]]
#[1] "A/268/200/300/400"
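
If the goal is only the list of cumulative prefixes, a regex is not strictly required. The sketch below swaps in a different technique (split on "/" and paste the pieces back together cumulatively with Reduce); it is an illustration under that assumption, not the approach used in the answer above, and the variable names are made up for the example.

```r
path <- "A/268/200/300/400/500"

# Split the string into its levels: "A" "268" "200" "300" "400" "500"
parts <- strsplit(path, "/", fixed = TRUE)[[1]]

# Cumulatively paste the levels back together, keeping every intermediate result
prefixes <- unlist(Reduce(function(a, b) paste(a, b, sep = "/"),
                          parts, accumulate = TRUE))

head(prefixes, 5)
#[1] "A"                 "A/268"             "A/268/200"
#[4] "A/268/200/300"     "A/268/200/300/400"
```

This returns a character vector instead of a list, which may be more convenient when the prefixes are used as column values.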

Using a regex in base R together with a for loop:

for (n in 1:lengths(regmatches("A/268/200/300/400/500" , 
gregexpr("/", "A/268/200/300/400/500")))) {
  # \\K drops everything matched so far, so only the n-th slash and the rest are removed
  print(gsub(paste0("^(?:[^/]*\\K/){", n, "}.*"), "", "A/268/200/300/400/500", perl = TRUE))
}
#> [1] "A"
#> [1] "A/268"
#> [1] "A/268/200"
#> [1] "A/268/200/300"
#> [1] "A/268/200/300/400"

First solution:

In base R, we can build the regex pattern and vary n (possibly inside a for loop) to extract the desired result:

N <- lengths(regmatches("A/268/200/300/400/500" , gregexpr("/", "A/268/200/300/400/500")))
n <- 3
strsplit("A/268/200/300/400/500",paste0("([^/]+)(?:/[^/]+){",N-n,"}$"))
#> [[1]]
#> [1] "A/268/200/"
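
Note that the strsplit result above keeps a trailing slash. A minimal sketch of one way to avoid it, assuming the aim is to keep exactly the first n levels, is to delete the trailing levels with sub() instead. The helper name keep_levels is hypothetical and is not part of the original answer.

```r
# Hypothetical helper: keep the first n levels by removing the remaining
# "/level" segments from the end of the string.
keep_levels <- function(x, n) {
  # Total number of slash-separated levels in x
  total <- lengths(strsplit(x, "/", fixed = TRUE))
  # Remove (total - n) trailing "/level" segments
  sub(sprintf("(?:/[^/]+){%d}$", total - n), "", x, perl = TRUE)
}

keep_levels("A/268/200/300/400/500", 3)
#> [1] "A/268/200"
```

The sketch assumes n is between 1 and the number of levels in x; values outside that range are not handled.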
