r - Make repeated elements of a character vector unique, but differently from make.unique()

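When several references share the same author and publication year, it is customary to append a lowercase letter to the year. I am looking for an elegant function that does this: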

# what I have
have <- c("Dawkins (2008)",
"Dawkins (2008)",
"Stephenson (2008)")
# what I want
want <- c("Dawkins (2008a)",
"Dawkins (2008b)",
"Stephenson (2008)")
# this would do the job, but is not really what I want
make.unique(have)
#> [1] "Dawkins (2008)"    "Dawkins (2008).1"  "Stephenson (2008)"

Created on 2022-02-24 by the reprex package (v2.0.1)

Edit: a solution based on @akrun's answer below

library(dplyr)
library(stringr)
have <- c("Dawkins (2008)",
"Dawkins (2008)",
"Stephenson (2008)")
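f <- function(x){
  # letter suffix for each duplicated element; "" for elements that occur once
  v1 <- ave(x, x, FUN = function(x) if(length(x) > 1) letters[seq_along(x)] else "")
  # the closing parenthesis must be escaped as "\\)" inside an R string
  stringr::str_replace(x, "\\)", stringr::str_c(v1, ")"))
}
data.frame(ha = have) %>%
  mutate(want = f(ha))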
#>                  ha              want
#> 1    Dawkins (2008)   Dawkins (2008a)
#> 2    Dawkins (2008)   Dawkins (2008b)
#> 3 Stephenson (2008) Stephenson (2008)

Created on 2022-02-24 by the reprex package (v2.0.1)

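We can pick elements of letters according to the length of each group of duplicates (assuming no group has more than 26 members), and then use str_replace to insert the letter just before the closing ):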

library(stringr)
v1 <- ave(have, have, FUN = function(x)
  if(length(x) > 1) letters[seq_along(x)] else "")
# escape the parenthesis as "\\)"; a single backslash is not a valid R string escape
str_replace(have, "\\)", str_c(v1, ")"))
[1] "Dawkins (2008a)"   "Dawkins (2008b)"   "Stephenson (2008)"
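
For readers who prefer to avoid the stringr dependency, the same idea can be sketched in base R. This is only an illustration, not part of the original answer: the helper name add_year_suffix is hypothetical, and the sketch assumes every string ends in ")" and that no citation is repeated more than 26 times.

```r
# Hypothetical helper (the name is not from the original post): append
# a, b, c, ... to the year of every citation that occurs more than once.
add_year_suffix <- function(x) {
  # Position of each element within its group of identical strings (1, 2, 3, ...)
  idx <- ave(seq_along(x), x, FUN = seq_along)
  # Size of the group each element belongs to
  n <- ave(seq_along(x), x, FUN = length)
  # Assumption: at most 26 repeats, otherwise letters[] would run out
  stopifnot(all(idx <= length(letters)))
  # Only duplicated citations get a letter; unique ones get an empty suffix
  suffix <- ifelse(n > 1, letters[idx], "")
  # Assumption: each string ends in ")"; strip it, then re-append after the suffix
  paste0(sub("\\)$", "", x), suffix, ")")
}

have <- c("Dawkins (2008)",
          "Dawkins (2008)",
          "Stephenson (2008)")
add_year_suffix(have)
#> [1] "Dawkins (2008a)"   "Dawkins (2008b)"   "Stephenson (2008)"
```

The closing parenthesis is stripped and re-appended with paste0() because the replacement argument of sub() is not vectorized, whereas str_replace() accepts one replacement per element.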
