R string separation problem



I am working with several strings like the ones shown below.

Col1
--------------------------
554 - partial-completion_3
4011 - structure painted
5459 - 1 int mam-corrosion issue
996 - cast iron countershock

My goal is to split each of these strings into two parts:

Col1_Id   Col2_Desc
--------------------------
554       partial-completion_3
4011      structure painted
5459      1 int mam-corrosion issue
996       cast iron countershock

I tried using the `separate` function:

df_sep = df %>%
  separate(Col1, c("Col1_ID", "Col2_Desc"), "-")

This works only when the string contains a single `-`. If there are two `-` characters, as in the string

       `5459 - 1 int mam-corrosion issue`

then `separate` drops the part of the description after the second `-`, and the output looks like

       `5459 - 1 int mam` 

This is not what I expect. I expect output like this:

    Col1_Id   Col2_Desc
    --------------------------
    554       partial-completion_3
    4011      structure painted
    5459      1 int mam-corrosion issue
    996       cast iron countershock

Any hints or suggestions are much appreciated.

We can use `sub` to replace the first `-` with a comma, and then read the result with `read.csv`:

read.csv(text= sub("-", ",", df1$Col1), header=FALSE, 
          col.names=c("Col1_Id",   "Col2_Desc"), stringsAsFactors=FALSE)
#   Col1_Id                  Col2_Desc
#1     554       partial-completion_3
#2    4011          structure painted
#3    5459  1 int mam-corrosion issue
#4     996     cast iron countershock
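
Because `sub("-", ",", ...)` replaces only the hyphen, the space that followed it appears to stay at the start of `Col2_Desc` (note the extra space in the printed output above). A minimal sketch of a variant that replaces the whole `" - "` delimiter and also passes `strip.white = TRUE`; it assumes no description contains a comma, since a comma would be read as an additional field separator:

```r
# Same sample data as in the question
df1 <- data.frame(
  Col1 = c("554 - partial-completion_3", "4011 - structure painted",
           "5459 - 1 int mam-corrosion issue", "996 - cast iron countershock"),
  stringsAsFactors = FALSE
)

# Replace only the first " - " (sub, not gsub) with a comma, then parse as CSV.
# strip.white = TRUE trims any stray whitespace around character fields.
res <- read.csv(text = sub(" - ", ",", df1$Col1, fixed = TRUE),
                header = FALSE,
                col.names = c("Col1_Id", "Col2_Desc"),
                stringsAsFactors = FALSE,
                strip.white = TRUE)
```

Here `Col1_Id` comes back as an integer column, because `read.csv` converts column types by default.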

With `separate`, there is an `extra` argument that can be used to handle this:

library(tidyr)
separate(df1, Col1, into = c("Col1_Id", "Col2_Desc"), extra="merge")
#  Col1_Id                 Col2_Desc
#1     554      partial-completion_3
#2    4011         structure painted
#3    5459 1 int mam-corrosion issue
#4     996    cast iron countershock
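
The call above relies on the default `sep`, which splits on any run of non-alphanumeric characters. As a hedged variant, the delimiter can be stated explicitly as `" - "`, and `convert = TRUE` can be added so that the ID column is converted to a number. This sketch assumes the tidyr package is installed:

```r
library(tidyr)

# Same sample data as in the question
df1 <- data.frame(
  Col1 = c("554 - partial-completion_3", "4011 - structure painted",
           "5459 - 1 int mam-corrosion issue", "996 - cast iron countershock"),
  stringsAsFactors = FALSE
)

# sep = " - "     : split on the full delimiter, not on every hyphen
# extra = "merge" : split at most once, keeping the remainder in Col2_Desc
# convert = TRUE  : run type conversion so Col1_Id becomes integer
res <- separate(df1, Col1, into = c("Col1_Id", "Col2_Desc"),
                sep = " - ", extra = "merge", convert = TRUE)
```

With an explicit `sep`, hyphens inside the description (such as `mam-corrosion`) are never treated as split points, even without `extra = "merge"`; the argument then only matters when `" - "` itself occurs more than once.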

Data

df1 <- structure(list(Col1 = c("554 - partial-completion_3", "4011 - structure painted", 
"5459 - 1 int mam-corrosion issue", "996 - cast iron countershock"
)), .Names = "Col1", class = "data.frame", row.names = c(NA, 
-4L))

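A base R alternative is `strsplit`, which splits the column into a list, followed by `rbind.data.frame` to construct the data.frame. `setNames` is used to conveniently set the names in the same line.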

setNames(do.call(rbind.data.frame, strsplit(df1$Col1, split=" - ")),
         c("Col1_Id", "Col2_Desc"))
#  Col1_Id                 Col2_Desc
#1     554      partial-completion_3
#2    4011         structure painted
#3    5459 1 int mam-corrosion issue
#4     996    cast iron countershock
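
The `strsplit` approach works for this data because `" - "` occurs once per string. If a description itself contained `" - "`, `strsplit` would yield three or more pieces for that row. A minimal base R sketch that splits only at the first occurrence, using `regexpr` with `regmatches(..., invert = TRUE)`; the second input string is a hypothetical example added for illustration and is not from the original data:

```r
# One string from the question plus a hypothetical one with two " - " delimiters
x <- c("5459 - 1 int mam-corrosion issue", "12 - a - b")

# regexpr() locates only the first match in each string;
# invert = TRUE returns the text before and after that match.
parts <- regmatches(x, regexpr(" - ", x, fixed = TRUE), invert = TRUE)

# Each list element has exactly two pieces, so rbind gives a two-column matrix
mat <- do.call(rbind, parts)
res <- data.frame(Col1_Id = mat[, 1], Col2_Desc = mat[, 2],
                  stringsAsFactors = FALSE)
```

Both columns are character here; `as.integer(res$Col1_Id)` can be applied if a numeric ID is needed.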
