r语言 - 如何按年改变两个种群的比率


overseas_domestic_indicator ref_year count
<chr>                          <dbl> <dbl>
1 Domestic                        2014 17854
2 Domestic                        2015 18371
3 Domestic                        2016 18975
4 Domestic                        2017 19455
5 Domestic                        2018 19819
6 Overseas                        2014  6491
7 Overseas                        2015  7393
8 Overseas                        2016  8594
9 Overseas                        2017  9539
10 Overseas                        2018 10455

这是我的数据。我想要这样的东西:

ref_year Domestic/Overseas
2014    2.75
2015    ...
...     ...

但是我不知道如何使用整洁来做到这一点。我尝试使用突变,但我不知道如何澄清国内和海外的计数。提前谢谢。

您可以先获取宽格式的数据,然后将Domestic除以Overseas

library(dplyr)
df %>%
tidyr::pivot_wider(names_from = overseas_domestic_indicator, 
values_from = count) %>%
mutate(ratio = Domestic/Overseas)

#  ref_year Domestic Overseas ratio
#     <int>    <int>    <int> <dbl>
#1     2014    17854     6491  2.75
#2     2015    18371     7393  2.48
#3     2016    18975     8594  2.21
#4     2017    19455     9539  2.04
#5     2018    19819    10455  1.90

我们可以按"ref_year"做一个组,通过将"国内"对应的"计数"与"海外"对应的"计数"除以summarise,并根据需要重新塑造为"宽">

library(dplyr)
library(tidyr)
df1 %>%
group_by(ref_year) %>%
summarise(
`Domestic/Overseas` = count[overseas_domestic_indicator == 'Domestic']/
count[overseas_domestic_indicator == 'Overseas']) 
# A tibble: 5 x 2
#  ref_year `Domestic/Overseas`
#     <int>               <dbl>
#1     2014                2.75
#2     2015                2.48
#3     2016                2.21
#4     2017                2.04
#5     2018                1.90

或者先arrange再做一个除法

df1 %>%
arrange(ref_year, overseas_domestic_indicator) %>% 
group_by(ref_year) %>% 
summarise( `Domestic/Overseas` = first(count)/last(count))

或者来自data.tabledcast

library(data.table)
dcast(setDT(df1), ref_year ~ overseas_domestic_indicator)[, 
`Domestic/Overseas` := Domestic/Overseas][]

数据

df1 <- structure(list(overseas_domestic_indicator = c("Domestic", "Domestic", 
"Domestic", "Domestic", "Domestic", "Overseas", "Overseas", "Overseas", 
"Overseas", "Overseas"), ref_year = c(2014L, 2015L, 2016L, 2017L, 
2018L, 2014L, 2015L, 2016L, 2017L, 2018L), count = c(17854L, 
18371L, 18975L, 19455L, 19819L, 6491L, 7393L, 8594L, 9539L, 10455L
)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "10"))

这也应该有效,有多种方法可以做到这一点

df %>%
pivot_wider(overseas_domestic_indicator, 
names_from = overseas_domestic_indicator, 
values_from = count) %>%
mutate(Ratio = Domestic/Overseas)

最新更新