Custom abbreviations for a character vector in R



I'm trying to apply custom abbreviations to some of the strings in a vector, degree_abrev, which I pulled from the data frame gss.

This is what I've come up with so far, but I'd like to see whether anyone has a "prettier" way of doing it.

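# Order matters: "Lt High School" has to be replaced before "High School",
# otherwise the shorter pattern matches inside the longer label first.
degree_abrev <- gsub("Lt High School", "LtHS", gss$degree)
degree_abrev <- gsub("High School", "HS", degree_abrev)
degree_abrev <- gsub("Junior College", "JC", degree_abrev)
degree_abrev <- gsub("Bachelor", "B", degree_abrev)
degree_abrev <- gsub("Graduate", "G", degree_abrev)

The "plyr" package has a "mapvalues" function that does this. I'm sure there are other ways to do it as well.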

> library(plyr)
> degree_abbrev <- c("Lt High School", "High School", "Junior College", "Bachelor", "Graduate")
> degree_abbrev
[1] "Lt High School" "High School"    "Junior College" "Bachelor"       "Graduate"
> degree_abbrev <- mapvalues(degree_abbrev,
+                            from = c("Lt High School", "High School", "Junior College", "Bachelor", "Graduate"),
+                            to = c("LtHS", "HS", "JC", "B", "G"))
> degree_abbrev
[1] "LtHS" "HS"   "JC"   "B"    "G"
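
If you'd rather not load plyr, one of those other ways is a plain named-vector lookup in base R. This is only a sketch: the names abbrev_lookup and degree are made up for illustration, and degree stands in for gss$degree.

```r
# Lookup table: names are the full labels, values are the abbreviations.
# (abbrev_lookup and degree are hypothetical names used for this sketch.)
abbrev_lookup <- c(
  "Lt High School" = "LtHS",
  "High School"    = "HS",
  "Junior College" = "JC",
  "Bachelor"       = "B",
  "Graduate"       = "G"
)

# Stand-in for gss$degree; repeated values are fine.
degree <- c("Lt High School", "High School", "Junior College",
            "Bachelor", "Graduate", "High School")

# Index the lookup by label. as.character() guards against a factor column,
# which would otherwise be indexed by its integer codes.
# unname() drops the names carried over from the lookup table.
degree_abbrev <- unname(abbrev_lookup[as.character(degree)])
degree_abbrev
# [1] "LtHS" "HS"   "JC"   "B"    "G"    "HS"
```

Unlike the chain of gsub calls, this matches whole labels, so the order of the entries doesn't matter. Any label that isn't in the lookup comes back as NA, which may or may not be what you want.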

I don't know whether this is any prettier, but I prefer using sapply.

degree_abrev <- c("Lt High School", "High School", "Junior College", "Bachelor", "Graduate")
sapply(strsplit(degree_abrev, " "), function(x){paste(substring(x, 1, 1), collapse = "")})
[1] "LHS" "HS"  "JC"  "B"   "G"  
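
Note that taking first letters turns "Lt High School" into "LHS" rather than the "LtHS" asked for in the question. A possible tweak, assuming you are happy to list the words that should be kept whole, is sketched below; abbreviate_label and its keep argument are hypothetical names, not part of any package.

```r
# Build an abbreviation from the first letter of each word, except for
# words listed in `keep`, which are kept whole.
# (abbreviate_label and keep are hypothetical names for this sketch.)
abbreviate_label <- function(label, keep = "Lt") {
  words <- strsplit(label, " ", fixed = TRUE)[[1]]
  parts <- ifelse(words %in% keep, words, substring(words, 1, 1))
  paste(parts, collapse = "")
}

degree_abrev <- c("Lt High School", "High School", "Junior College",
                  "Bachelor", "Graduate")

# vapply is like sapply but checks that each result is a single string.
vapply(degree_abrev, abbreviate_label, character(1), USE.NAMES = FALSE)
# [1] "LtHS" "HS"   "JC"   "B"    "G"
```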
