R: Remove words from strings in a data frame



Suppose I have the following dataset:

Date_Received = c("Addition 1/2/2018", "Swimming Pool 1/8/2018", "Abandonment 1/9/2018", "Existing Approval 3/14/2018", "Holding Tank 5/11/2018")
Date_Approved = c("1/2/2018", "1/8/2018", "1/9/2018", "SB 3/21/2018", "JW 5/11/2018")

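I want to remove the characters before the date in the Date_Received column, so that I can later convert it to a date type with lubridate.

I tried the code below, but it only removes the first capital letter.

How can I fix this?

Desired output: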

Date_Received Date_Approved 
1/2/2018      1/2/2018
1/8/2018      1/8/2018
1/9/2018      1/9/2018
3/14/2018     SB 3/21/2018
5/11/2018     JW 5/11/2018

Code:
library(tidyverse)
df = data.frame(Date_Received, Date_Approved)
df = df %>% mutate(Date.Received = trimws(Date_Received, whitespace = "[A-Z]*\\s*")) %>% filter(nzchar(Date.Received))

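We can use trimws. Its whitespace argument (which you already use in your code) takes a regular expression defining what counts as whitespace. Setting it to a non-digit pattern and trimming only on the left removes everything before the date.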

library(dplyr)
df %>%
  mutate(Date_Received = trimws(Date_Received, which = "left", whitespace = "\\D"))
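To see why the pattern matters, here is a minimal base R sketch that runs the question's pattern and the non-digit pattern side by side on two of the sample strings. The variable names `x`, `attempt` and `fixed` are only illustrative, and the `whitespace` argument assumes R 3.6.0 or later. As far as I can tell, `[A-Z]*` only matches a run of capital letters, so it stops as soon as it reaches the first lowercase letter.

```r
# Two of the sample strings from the question
x <- c("Addition 1/2/2018", "Swimming Pool 1/8/2018")

# The question's pattern: capital letters followed by spaces.
# Each string starts with a single capital, so very little is trimmed.
attempt <- trimws(x, whitespace = "[A-Z]*\\s*")

# Non-digit pattern, trimmed on the left only: everything up to the
# first digit is removed, whatever the letters or spaces are.
fixed <- trimws(x, which = "left", whitespace = "\\D")

print(attempt)
print(fixed)
```

Restricting the trim to the left side also keeps the call from touching the end of the string.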

Or with str_replace_all:

library(stringr)
df %>%
  mutate(Date_Received = str_replace_all(Date_Received, "^\\D+", ""))

Date_Received Date_Approved
1      1/2/2018      1/2/2018
2      1/8/2018      1/8/2018
3      1/9/2018      1/9/2018
4     3/14/2018  SB 3/21/2018
5     5/11/2018  JW 5/11/2018

Another option, using sub:

df$Date_Received <- sub("^\\D+", "", df$Date_Received)

Keep life simple and take the last space-separated word:

Date_Received = c("Addition 1/2/2018", "Swimming Pool 1/8/2018", "Abandonment 1/9/2018", "Existing Approval 3/14/2018", "Holding Tank 5/11/2018")
stringr::word(Date_Received, -1)
[1] "1/2/2018"  "1/8/2018"  "1/9/2018"  "3/14/2018" "5/11/2018"
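Since the question's end goal is a date column, here is a rough sketch of the full round trip in base R: strip the leading text, then parse. It assumes the dates are month/day/year, which the value 3/14/2018 suggests, and the names `cleaned` and `parsed` are just placeholders.

```r
Date_Received <- c("Addition 1/2/2018", "Swimming Pool 1/8/2018",
                   "Abandonment 1/9/2018", "Existing Approval 3/14/2018",
                   "Holding Tank 5/11/2018")

# Drop everything before the first digit
cleaned <- sub("^\\D+", "", Date_Received)

# Parse as month/day/year (assumed format) into a Date vector
parsed <- as.Date(cleaned, format = "%m/%d/%Y")

print(parsed)
class(parsed)
```

With lubridate loaded, `mdy(cleaned)` should produce the same Date vector without the explicit format string.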
