Remove words between a character and a blank line in R



I have a data frame in which one column is full of cells that look like this:

"***ORDER LIST***\nCustomer: Lucille\nitem1: apples\nitem2: oranges"
"***ORDER LIST***\nCustomer: Frank and Sally\nitem1: wine\nitem2: milk"
"***ORDER LIST***\n\n\nitem1: wine\nitem2: milk"

I am trying to sanitize each cell by removing the entire line that starts with the word Customer or, if there is no such line, the first blank line.

I would like to end up with cleaned text data like this:

"***ORDER LIST***\nitem1: apples\nitem2: oranges"
"***ORDER LIST***\nitem1: wine\nitem2: milk"
"***ORDER LIST***\nitem1: wine\nitem2: milk"

Is there a way to use gsub to get rid of both the blank lines and the entire line containing Customer?

Thanks

Try this:

text <- c("***ORDER LIST***\nCustomer: Lucille\nitem1: apples\nitem2: oranges",
          "***ORDER LIST***\nCustomer: Frank and Sally\nitem1: wine\nitem2: milk",
          "***ORDER LIST***\n\n\nitem1: wine\nitem2: milk")

gsub("Customer: .*?\n|\n\n", " ", text)

[1] "***ORDER LIST***\n item1: apples\nitem2: oranges" "***ORDER LIST***\n item1: wine\nitem2: milk"
[3] "***ORDER LIST*** \nitem1: wine\nitem2: milk"
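
The call above leaves a space where the removed text was, so the result differs slightly from the output requested in the question. A minimal sketch of one possible variation follows. It assumes that any run of consecutive line breaks may be collapsed into a single one, which removes every blank line and not only the first. `clean_order` is a hypothetical helper name, not part of the original answer.

```r
text <- c("***ORDER LIST***\nCustomer: Lucille\nitem1: apples\nitem2: oranges",
          "***ORDER LIST***\nCustomer: Frank and Sally\nitem1: wine\nitem2: milk",
          "***ORDER LIST***\n\n\nitem1: wine\nitem2: milk")

# Hypothetical helper (an assumption for illustration only).
clean_order <- function(x) {
  # Step 1: drop a line break followed by a line that starts with "Customer";
  # [^\n]* stops at the end of that line, so the next line break is kept.
  x <- gsub("\nCustomer[^\n]*", "", x)
  # Step 2: collapse two or more consecutive line breaks into one,
  # which removes the blank lines.
  gsub("\n{2,}", "\n", x)
}

cleaned <- clean_order(text)
print(cleaned)
```

Replacing with an empty string in the first step and a single line break in the second avoids the stray space that appears when the match is replaced with " ".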

Does this work for you?

gsub("(.*\\*).*?(\nitem.*)", "\\1\\2", text)
[1] "***ORDER LIST***\nitem1: apples\nitem2: oranges" "***ORDER LIST***\nitem1: wine\nitem2: milk"
[3] "***ORDER LIST***\nitem1: wine\nitem2: milk"
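
Since the question is about a data frame column, the same substitution can be assigned back to that column. The sketch below is only an illustration: the data frame `orders` and its column `notes` are hypothetical names, and it uses `perl = TRUE` with the `(?s)` flag so that `.` explicitly spans line breaks instead of relying on the default regex engine's behaviour.

```r
# Hypothetical data frame and column name (assumptions for illustration).
orders <- data.frame(
  notes = c("***ORDER LIST***\nCustomer: Lucille\nitem1: apples\nitem2: oranges",
            "***ORDER LIST***\nCustomer: Frank and Sally\nitem1: wine\nitem2: milk",
            "***ORDER LIST***\n\n\nitem1: wine\nitem2: milk"),
  stringsAsFactors = FALSE
)

# Group 1: everything up to the last "*" of the header.
# .*? : the shortest stretch of text up to the first "\nitem".
# Group 2: the line break plus the item lines, which are kept.
pattern <- "(?s)(.*\\*).*?(\nitem.*)"

# Overwrite the column with the cleaned text.
orders$notes <- gsub(pattern, "\\1\\2", orders$notes, perl = TRUE)
print(orders$notes)
```

Because `gsub` is vectorised over its input, no explicit loop over the rows is needed.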
