I have a DF that looks like crash_stats_TA below.
Use scales::percent:
library(dplyr)
crash_stats_TA %>%
mutate(crashes_perc = scales::percent(Crashes/sum(Crashes, na.rm = TRUE)))
TA_code TA_name Crashes crashes_perc
<int> <chr> <int> <chr>
1 61 Grey 126 20.8%
2 62 Buller 345 56.8%
3 63 Westland 24 4.0%
4 64 Timaru 112 18.5%
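Note that crashes_perc above is a <chr> column, so it suits display rather than further arithmetic. A minimal sketch, assuming the scales package is installed, that fixes the number of decimals through the accuracy argument and checks the result type. The vector names crashes and labels are illustrative and not part of the original answer:

```r
crashes <- c(126L, 345L, 24L, 112L)

# accuracy = 0.1 rounds each percentage to one decimal place
labels <- scales::percent(crashes / sum(crashes, na.rm = TRUE), accuracy = 0.1)

# The result is text, not numeric
is.character(labels)

# Stripping the "%" sign recovers numbers, but only at the rounded precision
as.numeric(sub("%", "", labels, fixed = TRUE))
```

If the percentages are needed for sorting or further calculation, it is probably simpler to keep a numeric column and format it only when printing.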
Add group_by only when each group contains enough rows.
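The result is 100% because each group contains only a single 'Crashes' value, so sum returns that same value. The calculation should instead be done without any grouping: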
library(dplyr)
crash_stats_TA %>%
mutate(crashes_perc = round(Crashes/sum(Crashes, na.rm = TRUE)*100,2))
-output
TA_code TA_name Crashes crashes_perc
1 61 Grey 126 20.76
2 62 Buller 345 56.84
3 63 Westland 24 3.95
4 64 Timaru 112 18.45
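To see why grouping produces 100% in every row, the two variants can be compared side by side. This is a sketch under an assumption: the grouped call from the question is not shown here, so grouping by TA_name is used as a stand-in for any grouping that leaves one row per group.

```r
library(dplyr)

crash_stats_TA <- data.frame(
  TA_code = 61:64,
  TA_name = c("Grey", "Buller", "Westland", "Timaru"),
  Crashes = c(126L, 345L, 24L, 112L)
)

# Assumed grouping: one row per group, so sum(Crashes) within a group
# equals the row's own value and every share comes out as 100.
grouped <- crash_stats_TA %>%
  group_by(TA_name) %>%
  mutate(crashes_perc = round(Crashes / sum(Crashes, na.rm = TRUE) * 100, 2)) %>%
  ungroup()

# No grouping: sum(Crashes) is the total over all rows (607).
ungrouped <- crash_stats_TA %>%
  mutate(crashes_perc = round(Crashes / sum(Crashes, na.rm = TRUE) * 100, 2))

grouped$crashes_perc
ungrouped$crashes_perc
```

If the data frame arrives already grouped from an earlier step, calling ungroup() before mutate should have the same effect as never grouping.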
In base R, use proportions:
crash_stats_TA$crashes_perc <- with(crash_stats_TA, round(100 *
proportions(Crashes), 2))
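For a plain vector with no margin given, proportions divides each element by the sum of the vector. A short sketch that checks this against the manual calculation and against prop.table, the older name for the same function (proportions was added in R 4.0.0). The variable names are illustrative only:

```r
crashes <- c(126L, 345L, 24L, 112L)

# Manual share of the total
manual <- crashes / sum(crashes)

# proportions() with no margin gives the same result
shares <- proportions(crashes)

# prop.table() is available on R versions older than 4.0.0
shares_old <- prop.table(crashes)

all.equal(shares, manual)
all.equal(shares_old, manual)

round(100 * shares, 2)
```

Unlike the dplyr version with na.rm = TRUE, proportions does not drop missing values, so an NA in Crashes would make every share NA.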
data
crash_stats_TA <- structure(list(TA_code = 61:64,
TA_name = c("Grey", "Buller", "Westland", "Timaru"),
Crashes = c(126L, 345L, 24L, 112L)),
class = "data.frame", row.names = c(NA, -4L))