R: Strange regex matching with the "+" quantifier?

Hi, suppose I have a string like this:

temp = "Big Satchel - Bird - turquoise"

I want to remove the last "-" and the text that follows it.

So I first tested it with this command, and it works as expected:

stringi::stri_replace_last(temp, regex = '-...', '')
[1] "Big Satchel - Bird rquoise"

However,

> stringi::stri_replace_last(temp, regex = '-.+$', '')
[1] "Big Satchel "
> stringi::stri_replace_last(temp, regex = '-.+?$', '')
[1] "Big Satchel "

So why does it find and remove the last match when there is no quantifier, but fail in the other cases? What I ultimately want to print is this:

Charming Satchel - Bird

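We can use [^-]+ to match one or more characters that are not -. The . is a metacharacter that matches any character, including -. So in the OP's patterns the match starts at the first - and .+ then consumes every remaining character up to the end of the string. Making the quantifier lazy (.+?) does not help: a regex match always starts at the leftmost possible position, and the $ anchor forces the lazy quantifier to keep expanding until the end of the string. There is therefore only one match, and stri_replace_last has no later match to pick.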
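To make the difference visible, here is a small sketch that lists what each pattern actually matches in the OP's string. It assumes the stringi package is installed; the variable names (fixed_width, greedy, lazy, negated) are only illustrative.

```r
library(stringi)

temp <- "Big Satchel - Bird - turquoise"

# Fixed-width pattern: "-" plus exactly three characters.
# It matches twice, so a "replace last" has a second match to pick.
fixed_width <- stri_extract_all_regex(temp, "-...")[[1]]

# Greedy and lazy patterns anchored with $: each yields a single match
# that begins at the FIRST "-" and runs to the end of the string.
greedy <- stri_extract_all_regex(temp, "-.+$")[[1]]
lazy   <- stri_extract_all_regex(temp, "-.+?$")[[1]]

# Negated class: the match cannot cross another "-",
# so it can only start at the last "-".
negated <- stri_extract_all_regex(temp, "-[^-]+$")[[1]]

print(fixed_width)  # "- Bi" "- tu"
print(greedy)       # "- Bird - turquoise"
print(lazy)         # "- Bird - turquoise"
print(negated)      # "- turquoise"
```

Only the fixed-width pattern produces more than one match, which is why it was the only case where "last" appeared to do what the OP expected.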

stringi::stri_replace_last(temp, regex = '\\s*-[^-]+$', '')
#[1] "Big Satchel - Bird"

With the current syntax, we can wrap it in another stri_replace to get the expected output:

stringi::stri_replace(stringi::stri_replace_last(temp, 
regex = '\\s*-[^-]+$', ''), regex = '\\w+', 'Charming')
#[1] "Charming Satchel - Bird"

Or use a single stri_replace with a capture group:

stringi::stri_replace(temp, regex = "^\\w+(\\s+.*)\\s+-[^-]+$",  "Charming$1")
#[1] "Charming Satchel - Bird"
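
For comparison, roughly the same two replacements can be sketched with base R's sub(), in case stringi is not available. This is an alternative rather than part of the answer above; note that base R refers to the capture group as \\1 instead of $1, and perl = TRUE is passed here so that \\s and \\w are handled by PCRE. The names trimmed and renamed are illustrative.

```r
temp <- "Big Satchel - Bird - turquoise"

# Drop the last " - ..." segment: [^-]+ cannot cross a "-",
# so the anchored match can only begin at the final hyphen.
trimmed <- sub("\\s*-[^-]+$", "", temp, perl = TRUE)

# Replace the first word and drop the last segment in one pass,
# keeping the middle part via the capture group.
renamed <- sub("^\\w+(\\s+.*)\\s+-[^-]+$", "Charming\\1", temp, perl = TRUE)

print(trimmed)  # "Big Satchel - Bird"
print(renamed)  # "Charming Satchel - Bird"
```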
