I'm coding in R. If the text in column b ends with SX or DX, I want to split that suffix off. The final part (SX or DX) must be kept and added to a new column.
# input data (outer quotes are single so the inner double quotes parse)
a = read.table(text='
Num X b
12 3 "ab SX"
13 35 "sd DX"
14 35 "dh af SX"
15 10 "sd"', header = TRUE)
# desired result
Result = read.table(text='
Num X b DXSX
12 3 "ab" SX
13 35 "sd" DX
14 35 "dh af" SX
15 10 "sd" 0', header = TRUE)
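We can use separate from tidyr, splitting on the whitespace that is followed by two uppercase letters. Rows without such a suffix get NA in the new column.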
library(tidyr)
# the backslash must be doubled inside an R string; fill = "right" pads missing suffixes with NA
separate(a, b, into = c("b", "DXSX"), sep = "\\s+(?=[A-Z]{2})", fill = "right")
-output
Num X b DXSX
1 12 3 ab SX
2 13 35 sd DX
3 14 35 dh af SX
4 15 10 sd <NA>
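If adding a tidyr dependency is not an option, the same split can be sketched in base R. This is only an illustration, not part of the original answer: it assumes the suffix is exactly SX or DX at the end of the string, and the names res and has_suffix are made up for the example.

```r
# Same data as in the question, rebuilt here so the snippet runs on its own
res <- data.frame(Num = c(12, 13, 14, 15),
                  X   = c(3, 35, 35, 10),
                  b   = c("ab SX", "sd DX", "dh af SX", "sd"))

# TRUE where b ends with whitespace followed by SX or DX (assumed suffix set)
has_suffix <- grepl("\\s+(SX|DX)$", res$b)

# Capture the suffix into the new column; rows without one get NA
res$DXSX <- ifelse(has_suffix,
                   sub("^.*\\s+(SX|DX)$", "\\1", res$b),
                   NA_character_)

# Strip the suffix (and the whitespace before it) from b
res$b <- sub("\\s+(SX|DX)$", "", res$b)

res
```

Replacing NA_character_ with "0" would reproduce the 0 shown in the desired Result from the question.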
data
a <- structure(list(Num = c(12, 13, 14, 15), X = c(3, 35, 35, 10),
b = c("ab SX", "sd DX", "dh af SX", "sd")),
class = "data.frame", row.names = c(NA,
-4L))