I want to extract the titles (Mr, Mrs, Miss) from the Name column and put them into a new column, Title. The relevant data is as follows:
snippet <- data_frame(Name=c('Braund, Mr. Owen Harris','Cumings, Mrs. John Bradley','Heikkinen, Miss. Laina'),Column=c('blah','blah','blah'))
I have reviewed this answer, but I must be missing something.
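This is the best code I could come up with:
snippet <- mutate(snippet, Title = str_extract(snippet $Name, "(?<=,)[^,]*(?=.)"))
This does add the Title column, but every value in that column is NA. Where is my mistake? Thanks.
Maybe this helps. In the Name column there is a space after the comma, so we use regex lookarounds to match the non-whitespace characters (\S+) that follow the comma and space ((?<=, )) and come before the period ((?=\.)). The . is a metacharacter, so we escape it; otherwise it matches any character. Inside an R string literal each backslash has to be doubled.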
library(dplyr)
library(stringr)
snippet <- snippet %>%
  mutate(Title = str_extract(Name, "(?<=, )\\S+(?=\\.)"))
Output:
snippet
# A tibble: 3 × 3
  Name                       Column Title
  <chr>                      <chr>  <chr>
1 Braund, Mr. Owen Harris    blah   Mr
2 Cumings, Mrs. John Bradley blah   Mrs
3 Heikkinen, Miss. Laina     blah   Miss
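To see why escaping the dot matters, here is a small sketch that runs the lookaround pattern with and without the escape. It uses base R (regexpr and regmatches with perl = TRUE) on a plain character vector rather than stringr, purely so that it runs without extra packages; the variable names are made up for illustration.

```r
# The names from the question, as a plain character vector
name <- c("Braund, Mr. Owen Harris",
          "Cumings, Mrs. John Bradley",
          "Heikkinen, Miss. Laina")

# Unescaped dot: (?=.) accepts ANY next character, so the greedy \S+
# keeps the trailing period and is satisfied by the space after it
unescaped <- regmatches(name, regexpr("(?<=, )\\S+(?=.)", name, perl = TRUE))

# Escaped dot: (?=\.) requires a literal period right after the match,
# so \S+ backtracks and stops just before it
escaped <- regmatches(name, regexpr("(?<=, )\\S+(?=\\.)", name, perl = TRUE))

print(unescaped)  # "Mr."  "Mrs." "Miss."
print(escaped)    # "Mr"   "Mrs"  "Miss"
```

Note that regmatches silently drops elements with no match, so on messier data the result can be shorter than the input; str_extract returns NA in those positions instead.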
Data
snippet <- structure(list(Name = c("Braund, Mr. Owen Harris",
                                   "Cumings, Mrs. John Bradley",
                                   "Heikkinen, Miss. Laina"),
                          Column = c("blah", "blah", "blah")),
                     class = c("tbl_df", "tbl", "data.frame"),
                     row.names = c(NA, -3L))
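As an alternative that is not part of the answer above, the same titles can be pulled out with a capture group instead of lookarounds. This is only a sketch using base R's sub, assuming every name has the form "Surname, Title. Rest"; the variable names are hypothetical.

```r
# The names from the question, as a plain character vector
name <- c("Braund, Mr. Owen Harris",
          "Cumings, Mrs. John Bradley",
          "Heikkinen, Miss. Laina")

# Match the whole string, capture the non-whitespace run between
# ", " and the literal period, and replace everything with that capture
title <- sub("^.*, (\\S+)\\..*$", "\\1", name, perl = TRUE)

print(title)  # "Mr" "Mrs" "Miss"
```

One difference to keep in mind: when a string does not match, sub returns it unchanged, whereas str_extract returns NA, so the lookaround version makes malformed rows easier to spot.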