r - How to remove numbered newlines from string?

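I am cleaning some text data and have run into a problem with removing newline text. In this data the text contains not only \n strings but also \n\n strings, as well as numbered newlines such as \n2 and \n\n2. The latter are my problem. How can I remove them with a regular expression?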

I am working in R. Here is some sample text and what I have tried so far:

#string
string <- "There is a square in the apartment. \n\n4Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten.\n2"
#code attempt
gsub("[\r\\n0-9]", '', string)

The problem with this regex is that it removes every digit and also matches the letter n.

I would like the following output:

"There is a square in the apartment. Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten."

I have been using regexr as a reference.

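Written like this, the pattern [\r\\n0-9] matches a single character that is a carriage return, one of the characters \ or n, or a digit 0-9.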
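
To see why the attempted class reaches the letter n, it can help to look at the characters R actually stores once it has parsed the string literal. This is a small sketch rather than part of the original answer; the variable names `attempt` and `newline_class` are only illustrative:

```r
# The character class from the question, written as an R string literal.
attempt <- "[\r\\n0-9]"
# A class that contains a real carriage return and a real line feed.
newline_class <- "[\r\n]"

# utf8ToInt() returns the code points R keeps after parsing each literal.
# In `attempt`, "\\n" becomes a backslash (92) followed by the letter n (110).
attempt_codes <- utf8ToInt(attempt)
# In `newline_class`, "\n" becomes a single line feed (10).
newline_codes <- utf8ToInt(newline_class)

print(attempt_codes)
print(newline_codes)
```

The line feed (code 10) never appears in the first vector, while 92 and 110 do, which fits the description above.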

You could instead write a pattern that matches 1 or more carriage returns or newlines, followed by optional digits:

[\r\n]+[0-9]*

Example:

string <- "There is a square in the apartment. \n\n4Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten.\n2"
gsub("[\r\n]+[0-9]*", '', string)

Output

[1] "There is a square in the apartment. Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten."
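
If the same cleanup has to run over a whole column of text, the pattern can be wrapped in a small helper. A minimal sketch, assuming the strings contain real carriage return and newline characters; `strip_numbered_newlines` is a hypothetical name, not an existing function, and the second and third inputs are made up for illustration:

```r
# Hypothetical helper: remove runs of CR/LF together with any digits
# that immediately follow them. gsub() is vectorised over x.
strip_numbered_newlines <- function(x) {
  gsub("[\r\n]+[0-9]*", "", x)
}

texts <- c(
  "There is a square in the apartment. \n\n4Great laughs, which I hear from the other room. 4 laughs. Several. 9 times ten.\n2",
  "first line\r\n12second line",  # made-up input with a Windows line ending
  "keep 7 here"                   # made-up input without any line break
)

cleaned <- strip_numbered_newlines(texts)
print(cleaned)
```

Digits that do not directly follow a line break are left alone. One caveat: a line that legitimately begins with a number would lose that number as well, so this only suits data where digits right after a newline are always noise.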

See an R demo.
