I am working with the following dataset:
     day descent_cd
   <int> <chr>
 1    26 B
 2    19 W
 3    19 B
 4    16 B
 5     1 W
 6     2 W
 7     2 B
 8     2 B
 9     3 W
10     3 W
# … with 1,283 more rows
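In short: the "day" variable is the day of the month, and "descent_cd" is race (Black or White).
I am trying to reorganize it so that I get one "B" column and one "W" column, both ordered by the total arrests for that day. In other words: count all the "B" and "W" values for day 1, then do the same for every other day of the month.
Ultimately I want to plot this as a geom_ridge plot.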
Is this what you are looking for?
library(tidyverse)

# sample data
df <- tibble::tribble(
  ~day, ~descent_cd,
  26L, "B",
  19L, "W",
  19L, "B",
  16L, "B",
  1L, "W",
  2L, "W",
  2L, "B",
  2L, "B",
  3L, "W",
  3L, "W"
)

df %>%
  group_by(day, descent_cd) %>%
  summarise(total_arrest = n()) %>%  # number of arrests per day per descent_cd
  pivot_wider(names_from = descent_cd, values_from = total_arrest) %>%  # create columns W and B
  mutate(W = if_else(is.na(W), 0L, W),  # replace NAs with 0 (meaning 0 arrests that day)
         B = if_else(is.na(B), 0L, B)) %>%
  arrange(desc(W + B))  # sort in descending order of total arrests per day
# A tibble: 6 x 3
# Groups:   day [6]
    day     W     B
  <int> <int> <int>
1     2     1     2
2     3     2     0
3    19     1     1
4     1     1     0
5    16     0     1
6    26     0     1
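
For the ridge plot mentioned in the question, the original long format (one row per arrest) is probably easier to work with than the wide table. The following is only a rough sketch: it assumes the ggridges package is installed and that geom_density_ridges() is the geom meant by "geom_ridge". It uses the same ten sample rows, so the shapes are illustrative only.

```r
library(ggplot2)
library(ggridges)  # assumption: installed separately from the tidyverse

# Same sample data as above, in long format (one row per arrest)
df <- data.frame(
  day = c(26L, 19L, 19L, 16L, 1L, 2L, 2L, 2L, 3L, 3L),
  descent_cd = c("B", "W", "B", "B", "W", "W", "B", "B", "W", "W")
)

# One density ridge per descent_cd, showing how arrests spread over the month
p <- ggplot(df, aes(x = day, y = descent_cd, fill = descent_cd)) +
  geom_density_ridges(alpha = 0.6) +
  labs(x = "Day of month", y = "descent_cd")

print(p)
```

With the full dataset, the same call should give one smoothed ridge for "B" and one for "W". If exact daily counts are preferred over a smoothed density, the counts from the summarise() step could be plotted instead.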