R - Aggregating two variables by a date



I am working with the following dataset:

     day descent_cd
   <int> <chr>
 1    26 B
 2    19 W
 3    19 B
 4    16 B
 5     1 W
 6     2 W
 7     2 B
 8     2 B
 9     3 W
10     3 W
# … with 1,283 more rows

In short: the "day" variable is the day of the month, and "descent_cd" is race (Black or White).

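I am trying to organize it so that I get one "B" column and one "W" column, with rows ordered by the total number of arrests on each day. In other words: count all the "B" and all the "W" rows for day 1, then do the same for every other day of the month.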

Ultimately I want to turn this into a geom_ridge plot.

Is this what you are looking for?

library(tidyverse)

# sample data
df <- tibble::tribble(
  ~day, ~descent_cd,
   26L, "B",
   19L, "W",
   19L, "B",
   16L, "B",
    1L, "W",
    2L, "W",
    2L, "B",
    2L, "B",
    3L, "W",
    3L, "W"
)

df %>%
  group_by(day, descent_cd) %>%
  summarise(total_arrest = n()) %>% # number of arrests per day per descent_cd
  pivot_wider(names_from = descent_cd, values_from = total_arrest) %>% # create columns W and B
  mutate(W = if_else(is.na(W), 0L, W), # replace NAs with 0 (meaning 0 arrests that day)
         B = if_else(is.na(B), 0L, B)) %>%
  arrange(desc(W + B)) # sort in descending order of total arrests per day
# A tibble: 6 x 3
# Groups:   day [6]
    day     W     B
  <int> <int> <int>
1     2     1     2
2     3     2     0
3    19     1     1
4     1     1     0
5    16     0     1
6    26     0     1
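
A slightly shorter variant of the same idea, as a sketch rather than the definitive way to do it: `count()` can stand in for `group_by()` plus `summarise(n())`, and the `values_fill` argument of `pivot_wider()` can replace the two `if_else()` calls. This assumes reasonably recent versions of dplyr and tidyr. The name `arrests_wide` is made up here for illustration.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the answer
df <- tibble::tribble(
  ~day, ~descent_cd,
   26L, "B",
   19L, "W",
   19L, "B",
   16L, "B",
    1L, "W",
    2L, "W",
    2L, "B",
    2L, "B",
    3L, "W",
    3L, "W"
)

arrests_wide <- df %>%
  # one row per day/descent_cd pair; count() returns an ungrouped result
  count(day, descent_cd, name = "total_arrest") %>%
  pivot_wider(
    names_from  = descent_cd,
    values_from = total_arrest,
    values_fill = 0L          # day/group pairs with no rows become 0 instead of NA
  ) %>%
  arrange(desc(W + B))        # days with the most total arrests first

arrests_wide
```

On the sample data this should give the same six rows as the output above, only without the leftover grouping by `day`. For the ridge plot mentioned in the question, ggplot2-style layers generally expect long data (one row per observation or per day/group count), so the wide W/B table may be more useful for inspection than for plotting.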
