r - Mutate 2 new columns with the proportion of TRUE or FALSE in dplyr



I have the following table in R:

CAT  CONDITION
A    TRUE
A    TRUE
A    FALSE
A    TRUE
B    TRUE
B    TRUE
B    TRUE
B    FALSE

You can also simply use mean with a logical variable:

library(dplyr)
data3 %>% 
  as_tibble() %>% # this converts your matrix into a tibble
  mutate(cond = as.logical(cond)) %>% # convert character to logical
  group_by(cat) %>% 
  summarise("TRUE" = mean(cond),
            "FALSE" = mean(!cond))

Output:

# A tibble: 2 x 3
cat   `TRUE` `FALSE`
<chr>  <dbl>   <dbl>
1 A       0.75    0.25
2 B       0.5     0.5 
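
The reason mean works here is that R treats TRUE as 1 and FALSE as 0 in arithmetic, so the mean of a logical vector is the share of TRUE values. A minimal base R sketch of that idea, using a made-up vector cond that mirrors group A (it is an illustration, not the original data3 object):

```r
# Illustrative logical vector (an assumption, mirroring group A above)
cond <- c(TRUE, TRUE, FALSE, TRUE)

n_true     <- sum(cond)    # TRUE counts as 1, FALSE as 0
prop_true  <- mean(cond)   # share of TRUE values
prop_false <- mean(!cond)  # negate first to get the share of FALSE

print(c(n_true = n_true, prop_true = prop_true, prop_false = prop_false))
```

This is also why the as.logical() step matters when cond is stored as character: mean() on a character column returns NA with a warning.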

Outside of dplyr, this is very easy to do with prop.table and table:

with(as.data.frame(data3), prop.table(table(cat, cond), 1))
     cond
cat   FALSE TRUE
  A    0.25 0.75
  B    0.50 0.50

Or, even simpler (credit to @G.Grothendieck), using xtabs:

prop.table(xtabs(~., data3), 1)

Like this?

library(tidyverse)
cat = c(rep("A",4),rep("B",4))
cond = c("TRUE","TRUE","FALSE","TRUE","FALSE","TRUE","TRUE","FALSE")
data3 = data.frame(cat,cond)
data3 %>%
  group_by(cat) %>%
  summarise("TRUE" = sum(cond == TRUE) / n(),
            "FALSE" = sum(cond == FALSE) / n())
#> # A tibble: 2 × 3
#>   cat   `TRUE` `FALSE`
#>   <chr>  <dbl>   <dbl>
#> 1 A       0.75    0.25
#> 2 B       0.5     0.5

Created on 2022-02-24 by the reprex package (v2.0.1)
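
If the proportions should be attached to every row as two new columns (as the "mutate" in the title suggests) rather than collapsed to one row per group, swapping summarise for mutate is one way to do it. This is only a sketch: the column names prop_true and prop_false are made up for illustration, and the example data is rebuilt here with cond stored as logical instead of character.

```r
library(dplyr)

# Same example data as above, but with cond stored as logical (an assumption)
data3 <- data.frame(
  cat  = c(rep("A", 4), rep("B", 4)),
  cond = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

result <- data3 %>%
  group_by(cat) %>%
  mutate(
    prop_true  = mean(cond),   # per-group share of TRUE, repeated on each row
    prop_false = mean(!cond)   # per-group share of FALSE, repeated on each row
  ) %>%
  ungroup()                    # drop the grouping so later steps are not affected

print(result)
```

The result keeps all 8 original rows; each row carries the proportions of its own group.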

Here is another approach: if you want the proportion of TRUE and FALSE for each CAT, this should work:

library(dplyr)
library(tidyr)
df %>% 
  group_by(CAT, CONDITION) %>% 
  tally() %>% 
  mutate(n = (n/sum(n))) %>% 
  pivot_wider(
    id_cols = CAT,
    names_from = CONDITION,
    values_from = n
  )
CAT   `FALSE` `TRUE`
<chr>   <dbl>  <dbl>
1 A        0.25   0.75
2 B        0.25   0.75
