Create a new column with unique IDs to handle duplicates across 2 columns in R



I have a data frame containing surveys carried out on specific subjects on specific days. It looks like this:

library(data.table)
library(tidyverse)
df <- fread("subject_id day data
01  1   34
01  1   58
02  1   54
03  3   55
04  4   56") %>% 
tibble()
df
#> # A tibble: 5 × 3
#>   subject_id   day  data
#>        <int> <int> <int>
#> 1          1     1    34
#> 2          1     1    58
#> 3          2     1    54
#> 4          3     3    55
#> 5          4     4    56
df %>%
  group_by(subject_id) %>%
  mutate(name = ifelse(order(day) == 1,
                       as.character(order(day)),
                       paste(day, "-", order(day))))
#> # A tibble: 5 × 4
#> # Groups:   subject_id [4]
#>   subject_id   day  data name 
#>        <int> <int> <int> <chr>
#> 1          1     1    34 1    
#> 2          1     1    58 1 - 2
#> 3          2     1    54 1    
#> 4          3     3    55 1    
#> 5          4     4    56 1

Created on 2022-06-23 by the reprex package (v2.0.1)

Try this:

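# Within each (subject_id, day) group, keep the first row as-is
# and suffix repeats with -2, -3, ...
df |>
  group_by(subject_id, day) |>
  mutate(name = ifelse(duplicated(day),
                       paste0(day, "-", cumsum(duplicated(day)) + 1),
                       as.character(day)))

# make_clean_names() de-duplicates repeated subject_id values
# within each day by appending _2, _3, ...
df %>%
  group_by(day) %>%
  mutate(name = janitor::make_clean_names(subject_id, use_make_names = FALSE))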

# A tibble: 5 x 4
# Groups:   day [3]
  subject_id   day  data name
       <int> <int> <int> <chr>
1          1     1    34 1
2          1     1    58 1_2
3          2     1    54 2
4          3     3    55 3
5          4     4    56 4

If surveys is your data as a data.frame:

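# Flag rows whose (subject_id, day) pair has already appeared earlier
duplicatedresults <- duplicated(surveys[, c("subject_id", "day")])
# Label the flagged rows as "<subject_id>-2"
surveys$name[duplicatedresults] <- paste0(surveys$subject_id[duplicatedresults], "-2")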
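
The snippet above always appends "-2", so a third survey for the same subject on the same day would get the same label as the second. As a rough sketch (not part of the original answer), a running count per subject_id/day pair can be built with base R's ave(). The surveys data frame below is re-created from the question's example, and occurrence is a variable name made up for illustration:

```r
# Example data re-created from the question (assumed to match the asker's df)
surveys <- data.frame(
  subject_id = c(1L, 1L, 2L, 3L, 4L),
  day        = c(1L, 1L, 1L, 3L, 4L),
  data       = c(34L, 58L, 54L, 55L, 56L)
)

# Running count of each (subject_id, day) pair:
# 1 for the first occurrence, 2 for the second, and so on
occurrence <- ave(
  seq_len(nrow(surveys)),
  surveys$subject_id, surveys$day,
  FUN = seq_along
)

# First occurrences keep the plain subject_id; repeats get "-<count>"
surveys$name <- ifelse(
  occurrence == 1,
  as.character(surveys$subject_id),
  paste0(surveys$subject_id, "-", occurrence)
)

print(surveys)
```

With this approach every row gets a name, and the suffix keeps counting if a pair shows up more than twice.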
