How to create a new variable in R based on whether the values in three columns are equal



I have a large data frame in which three variables have the following structure:

library(tibble)
author1_gender <- c("Men", "Men", "Women")
author2_gender <- c("Women", "Men", "Women")
author3_gender <- c("Men", "Men", "Women")
genders <- tibble(author1_gender, author2_gender, author3_gender)

This produces:

# A tibble: 3 × 3
author1_gender author2_gender author3_gender
<chr>          <chr>          <chr>         
1 Men            Women          Men           
2 Men            Men            Men           
3 Women          Women          Women         

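I would like to create a new column based on whether each row contains mixed genders, i.e. whether the three values in the row are equal. Ideally, I would add a column next to the three existing ones indicating whether the row contains only women, only men, or a mix of genders, i.e.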

# A tibble: 3 × 4
author1_gender author2_gender author3_gender gender_mix
<chr>          <chr>          <chr>          <chr>     
1 Men            Women          Men            mix       
2 Men            Men            Men            men       
3 Women          Women          Women          women  

With two values I could do this with identical(), but I can't seem to find out how to do it with three. Can anyone help me with this probably trivial problem?

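You can find the row-wise minimum and maximum across the columns whose names end in "gender". If the minimum equals the maximum, return that value; otherwise return "mix".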

library(dplyr, warn.conflicts = FALSE)
author1_gender <- c("Men", "Men", "Women")
author2_gender <- c("Women", "Men", "Women")
author3_gender <- c("Men", "Men", "Women")
genders <- tibble(author1_gender, author2_gender, author3_gender)
genders %>% 
  mutate(
    gender_mix =  
      lapply(c(pmax, pmin), do.call, across(ends_with('gender'))) %>% 
      {if_else(Reduce('==', .), .[[1]], 'mix')}
  )
#> # A tibble: 3 × 4
#>   author1_gender author2_gender author3_gender gender_mix
#>   <chr>          <chr>          <chr>          <chr>     
#> 1 Men            Women          Men            mix       
#> 2 Men            Men            Men            Men       
#> 3 Women          Women          Women          Women

Created on 2021-12-07 by the reprex package (v2.0.1)
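
To see what the lapply/do.call/Reduce chain is doing, it may help to unpack it step by step outside of mutate(). The following is a minimal base R sketch, assuming the three vectors from the question. The names cols, extremes and all_equal are illustrative only and do not appear in the answer above.

```r
# The three columns from the question, held in a plain named list
cols <- list(
  author1_gender = c("Men", "Men", "Women"),
  author2_gender = c("Women", "Men", "Women"),
  author3_gender = c("Men", "Men", "Women")
)

# do.call(pmax, cols) is equivalent to pmax(col1, col2, col3);
# lapply runs it once with pmax and once with pmin
extremes <- lapply(c(pmax, pmin), do.call, cols)

# Compare the row-wise maximum with the row-wise minimum
all_equal <- Reduce(`==`, extremes)

# Where they match, every value in the row is the same, so keep it
gender_mix <- ifelse(all_equal, extremes[[1]], "mix")
print(gender_mix)
#> [1] "mix"   "Men"   "Women"
```

Character vectors are compared in sort order, so a row's minimum and maximum can only coincide when all of its values are identical.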

If you have NAs, you can add the na.rm = TRUE argument to pmin and pmax:

genders %>% 
  mutate(
    gender_mix =  
      lapply(c(pmax, pmin), do.call, 
             c(across(ends_with('gender')),  na.rm = TRUE)) %>% 
      {if_else(Reduce('==', .), .[[1]], 'mix')}
  )
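
As a rough illustration of the na.rm = TRUE behaviour, here is a small base R sketch. The vectors a, b and d are made-up example data (not from the question) with some missing values:

```r
# Hypothetical example data with missing values
a <- c("Men", NA,    "Women")
b <- c("Women", "Men", NA)
d <- c("Men", "Men", "Women")

# With na.rm = TRUE, NAs are ignored when taking the row-wise min and max
lo <- pmin(a, b, d, na.rm = TRUE)
hi <- pmax(a, b, d, na.rm = TRUE)

# Rows whose non-missing values all agree keep that value
gender_mix <- ifelse(lo == hi, hi, "mix")
print(gender_mix)
#> [1] "mix"   "Men"   "Women"
```

Here a missing value is treated as "unknown, ignore it" rather than as a third category, which may or may not be what you want for your data.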

Alternatively, you can compare pmin() and pmax() directly:

genders %>% 
  mutate(gender_mix = ifelse(
    pmin(author1_gender, author2_gender, author3_gender) ==
      pmax(author1_gender, author2_gender, author3_gender),
    author1_gender, "mix"))

# A tibble: 3 x 4
author1_gender author2_gender author3_gender gender_mix
<chr>          <chr>          <chr>          <chr>     
1 Men            Women          Men            mix       
2 Men            Men            Men            Men       
3 Women          Women          Women          Women  
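
Both answers return the capitalised values ("Men", "Women"), whereas the question asks for lowercase labels. One possible variation, sketched here in base R and not taken from either answer, counts the distinct values in each row and lowercases the result. The helper label_row and the variable gender_cols are hypothetical names introduced for this example.

```r
genders <- data.frame(
  author1_gender = c("Men", "Men", "Women"),
  author2_gender = c("Women", "Men", "Women"),
  author3_gender = c("Men", "Men", "Women"),
  stringsAsFactors = FALSE
)

# Select every column whose name ends in "gender"
gender_cols <- grep("gender$", names(genders), value = TRUE)

# Hypothetical helper: one distinct value -> that value in lowercase,
# more than one distinct value -> "mix"
label_row <- function(values) {
  distinct <- unique(values)
  if (length(distinct) == 1) tolower(distinct) else "mix"
}

# Apply the helper to each row of the selected columns
genders$gender_mix <- unname(apply(genders[gender_cols], 1, label_row))
print(genders$gender_mix)
#> [1] "mix"   "men"   "women"
```

This works for any number of matching columns, but apply() goes row by row, so on a very large data frame the vectorised pmin/pmax approach is likely to be faster.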
