r - Using any() and all() on a string column in grouped data



I have grouped data, and I want to create a new variable based on the values of the rows within each group.

> df <- data.frame(Group = c("A","A","A","B","B","B", "C", "C"), Gender=c("M","M","F","F","F","F", "M", "M"))
>   df
Group Gender
1     A      M
2     A      M
3     A      F
4     B      F
5     B      F
6     B      F
7     C      M
8     C      M 

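In this example, I want to know whether each of the groups A, B and C is

  • Male only: all members of the group are male
  • Female only: all members of the group are female
  • Mixed: the group contains both males and females

So the desired result is: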

Group Gender  gender_mix
1     A      M       Mixed
2     A      M       Mixed
3     A      F       Mixed
4     B      F Female Only
5     B      F Female Only
6     B      F Female Only
7     C      M   Male Only
8     C      M   Male Only

I tried using any() and all(), but without success:

>   df%>%
+     group_by(Group)%>%
+     mutate(gender_mix=case_when(all(Gender)=="M"~"Male Only",
+                                 all(Gender)=="F"~"FemAle Only",
+                                 any(Gender)=="M"&any(Gender)=="F"~"Mixed",
+                                 TRUE~NA_character_))
# A tibble: 8 × 3
# Groups:   Group [3]
Group Gender gender_mix
<chr> <chr>  <chr>     
1 A     M      NA        
2 A     M      NA        
3 A     F      NA        
4 B     F      NA        
5 B     F      NA        
6 B     F      NA        
7 C     M      NA        
8 C     M      NA        
There were 12 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 1: Group = "A".
2: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 1: Group = "A".
3: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 1: Group = "A".
4: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 1: Group = "A".
5: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 2: Group = "B".
6: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 2: Group = "B".
7: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 2: Group = "B".
8: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 2: Group = "B".
9: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 3: Group = "C".
10: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 3: Group = "C".
11: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 3: Group = "C".
12: Problem while computing `gender_mix = case_when(...)`.
ℹ coercing argument of type 'character' to logical
ℹ The warning occurred in group 3: Group = "C".

Another problem is that my data is quite large (10M rows), and any() and all() seem to be very slow.

Any help would be much appreciated.

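The comparison has to go inside all() and any(). In all(Gender) == "M", all() is applied to the character column itself, which R tries to coerce to logical (hence the warnings), and only that single result is then compared with "M". Put the whole condition inside the parentheses: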

df %>%
  group_by(Group) %>%
  mutate(gender_mix = case_when(all(Gender == "M") ~ "Male only",
                                all(Gender == "F") ~ "Female only",
                                any(Gender == "F") & any(Gender == "M") ~ "Mixed",
                                TRUE ~ NA_character_))
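
To see why the placement of the comparison matters, here is a small base R check on a hypothetical all-male vector. The variable names are illustrative and not taken from the question.

```r
# Hypothetical group in which every member is male
gender <- c("M", "M")

# Comparison outside all(): the character vector is coerced to logical first
# (R warns about this), and the single coerced result is then compared with "M".
misplaced <- suppressWarnings(all(gender)) == "M"

# Comparison inside all(): all() receives a logical vector, one value per row.
correct <- all(gender == "M")

print(misplaced)  # NA
print(correct)    # TRUE
```

Because case_when() treats an NA condition as not matched, the misplaced form falls through to the final TRUE ~ NA_character_ branch.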

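On the speed concern, one option that may help is to keep only the cheap any() calls inside the grouped step and run case_when() once on the ungrouped data, rather than once per group. The following is a sketch under that assumption and has not been benchmarked on 10M rows. The helper columns has_m and has_f are names introduced here for illustration, and the labels follow the desired output in the question.

```r
library(dplyr)

df <- data.frame(
  Group  = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Gender = c("M", "M", "F", "F", "F", "F", "M", "M")
)

result <- df %>%
  group_by(Group) %>%
  # One logical flag per group, recycled to every row of that group
  mutate(
    has_m = any(Gender == "M"),
    has_f = any(Gender == "F")
  ) %>%
  ungroup() %>%
  # case_when() now runs once over the whole table instead of once per group
  mutate(
    gender_mix = case_when(
      has_m & has_f ~ "Mixed",
      has_m         ~ "Male Only",
      has_f         ~ "Female Only",
      TRUE          ~ NA_character_
    )
  ) %>%
  select(-has_m, -has_f)

print(result)
```

Whether this is actually faster depends on the number of groups and the dplyr version, so it is worth timing both variants on a sample of the real data.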