R - Numbered backreferences for groups that contain subgroups



I have the word "fan(s)" and want to replace it with "fanatic(s)" when it is preceded by a pronoun-verb combination, as shown below.

gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)", 
'\\1\\2atic\\3', 
"He's the bigest fan I know.", 
perl = TRUE, ignore.case = TRUE
)
## [1] "He's the bigest He'saticHe's I know."

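I know the numbered backreferences are picking up the inner parentheses of the first group. Is there a way to make them refer only to the three outer pairs of parentheses, i.e. the three groups that in pseudocode are (stuff before fan)(fan)(s\b)?

I know the regex itself is fine, because it replaces the whole match correctly. Only the backreference part is the problem.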

gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)", 
'', 
"He's the bigest fan I know.", 
perl = TRUE, ignore.case = TRUE
)
## [1] " I know."

Desired output:

## [1] "He's the bigest fanatic I know."

Matching examples:

inputs <- c(
"He's the bigest fan I know.",
"I am a huge fan of his.",
"I know she has lots of fans in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)

outputs <- c(
"He's the bigest fanatic I know.",
"I am a huge fanatic of his.",
"I know she has lots of fanatics in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)

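I understand you are confused by the large number of capturing groups. Turn the ones you are not interested in into non-capturing groups, or remove those that are completely redundant:

((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\b(Fan)(s?)\b

See the regex demo.

Note that [Ff] can be reduced to a plain F or f, because you pass the ignore.case = TRUE argument.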
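To make the renumbering visible, here is a minimal sketch that wraps the first three backreferences in brackets for both patterns. The helper name show_groups and the variables nested and flat are made up for this illustration. Groups are numbered by the position of their opening parenthesis, so in the original pattern \2 and \3 land on the nested pronoun groups, while (\b[Ff]an) is the tenth group. As far as the R documentation for sub and gsub describes, replacements only support \1 to \9, so that group could not be referenced there at all.

```r
x <- "He's the bigest fan I know."

# Original pattern: every "(" opens a numbered group, so \2 and \3
# are the nested pronoun groups, not "fan" and the optional "s".
nested <- "(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)"

# Rewritten pattern: "(?:" groups are not numbered, so only three remain.
flat <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"

# Hypothetical helper: put each of the first three groups in brackets.
show_groups <- function(pattern, text) {
  sub(pattern, "[\\1][\\2][\\3]", text, perl = TRUE, ignore.case = TRUE)
}

nested_view <- show_groups(nested, x)
flat_view <- show_groups(flat, x)

print(nested_view)
## [1] "[He's the bigest ][He's][He's] I know."
print(flat_view)
## [1] "[He's the bigest ][fan][] I know."
```

With the flat pattern the third bracket is empty because the optional s did not match anything in this sentence.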

R demo:

gsub(
"((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b", 
'\\1\\2atic\\3', 
inputs, 
perl = TRUE, ignore.case = TRUE
)

Output:

[1] "He's the bigest fanatic I know."                     
[2] "I am a huge fanatic of his."                         
[3] "I know she has lots of fans in his club"             
[4] "I was cold and turned on the fan"                    
[5] "An air conditioner is better than 2 fans at cooling."
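For reuse, the call can be wrapped in a small function and checked against the inputs. This is only a sketch: the name fan_to_fanatic is hypothetical, and the pattern is the one from the demo. It also makes it easy to see which sentences were changed. The third input stays as it is, which appears to be because "she has" is not one of the pronoun-verb forms the pattern lists (only is, are and am, plus their contractions).

```r
# Hypothetical wrapper around the gsub call from the demo.
fan_to_fanatic <- function(text) {
  gsub(
    "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b",
    "\\1\\2atic\\3",
    text,
    perl = TRUE, ignore.case = TRUE
  )
}

inputs <- c(
  "He's the bigest fan I know.",
  "I am a huge fan of his.",
  "I know she has lots of fans in his club",
  "I was cold and turned on the fan",
  "An air conditioner is better than 2 fans at cooling."
)

result <- fan_to_fanatic(inputs)

# TRUE where the replacement changed the sentence.
changed <- result != inputs
print(changed)
## [1]  TRUE  TRUE FALSE FALSE FALSE
```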
