Year-over-year percent change in group membership in R



I would like to look at the level of churn/growth in group membership, by group, in R.

My data:

year1 <-
  tibble(people = c("Joe A", "Max X", "Sam M", "Jane K", "Doug K"), group = c(1, 1, 1, 2, 2))
year2 <-
  tibble(people = c("Joe A", "Sam M", "Jane K", "Doug K", "Mike K", "Jen G", "Mohamad T"), group = c(1, 1, 1, 2, 2, 2, 2))

  • Group 1 lost Max but gained Jane, who moved over from group 2
  • Group 2 lost Jane but gained Mohamad

Is there a way to find out how many people joined or left a group each year, along with the percent change per year?

There may be easier options, but you could do it like this:

library(tidyverse)

year1 <- tibble(people = c("Joe A", "Max X", "Sam M", "Jane K", "Doug K"), group = c(1, 1, 1, 2, 2))
year2 <- tibble(people = c("Joe A", "Sam M", "Jane K", "Doug K", "Mike K", "Jen G", "Mohamad T"), group = c(1, 1, 1, 2, 2, 2, 2))

# For each group: stack both years, then compare the member lists
map(.x = unique(year1$group),
    .f = ~ year1 |>
      filter(group == .x) |>
      mutate(year = 1) |>
      bind_rows(year2 |>
                  filter(group == .x) |>
                  mutate(year = 2)) |>
      summarize(group      = unique(group),
                joined     = length(setdiff(people[year == 2], people[year == 1])),
                left       = length(setdiff(people[year == 1], people[year == 2])),
                n_year1    = sum(year == 1),
                n_year2    = sum(year == 2),
                # change relative to the year 1 head count
                pct_change = (n_year2 - n_year1) / n_year1)) |>
  bind_rows()

# A tibble: 2 × 6
  group joined  left n_year1 n_year2 pct_change
  <dbl>  <int> <int>   <int>   <int>      <dbl>
1     1      1     1       3       3          0
2     2      3     1       2       4          1
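The same per-group comparison can also be sketched without `map()`, by stacking both years once and summarising by group. This is only an illustrative variant, not the answer's own code: the names `membership` and `churn` are made up here, and `pct_change` is taken to mean the change relative to the year 1 head count.

```r
library(dplyr)

year1 <- tibble(people = c("Joe A", "Max X", "Sam M", "Jane K", "Doug K"),
                group = c(1, 1, 1, 2, 2))
year2 <- tibble(people = c("Joe A", "Sam M", "Jane K", "Doug K", "Mike K", "Jen G", "Mohamad T"),
                group = c(1, 1, 1, 2, 2, 2, 2))

# Stack both years; .id stores the argument names ("1", "2") in a character column
membership <- bind_rows(`1` = year1, `2` = year2, .id = "year")

churn <- membership |>
  group_by(group) |>
  summarise(
    joined     = length(setdiff(people[year == "2"], people[year == "1"])),
    left       = length(setdiff(people[year == "1"], people[year == "2"])),
    n_year1    = sum(year == "1"),
    n_year2    = sum(year == "2"),
    # assumption: percent change is measured against the year 1 head count
    pct_change = (n_year2 - n_year1) / n_year1
  )

churn
```

Note that someone who switches groups, like Jane, is counted as having left one group and joined the other.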

I made a few changes to the code, based on some assumptions:

library(tidyverse)
year1 <- tibble(people = c("Joe A", "Max X", "Sam M", "Jane K", "Doug K"), group = c(1, 1, 1, 2, 2), year = 1)
year2 <- tibble(people = c("Joe A", "Sam M", "Jane K", "Doug K", "Mike K", "Jen G", "Mohamad T"), group = c(1, 1, 1, 2, 2, 2, 2), year = 2)
years <- year1 %>% bind_rows(year2)
years %>%
  group_by(group, year) %>%
  summarise(n = n()) %>%
  group_by(group) %>%
  mutate(pct_change = n / lag(n) - 1)

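I assumed that your second data frame represents another year, so I bound the two into a single data frame and added a year column to identify which year each row belongs to.

Output: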

  group  year     n pct_change
  <dbl> <dbl> <int>      <dbl>
1     1     1     3         NA
2     1     2     3          0
3     2     1     2         NA
4     2     2     4          1
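The `NA` values in the year 1 rows come from `lag()`, which has no earlier count to compare against. A minimal sketch of that step on its own, using the group 2 head counts from the output above (the variable names `n` and `change` are just for illustration):

```r
library(dplyr)

# Head counts for group 2 in year 1 and year 2
n <- c(2L, 4L)

# lag() shifts the vector by one position, so each year sees the previous year's count
previous <- dplyr::lag(n)

# No baseline for the first year (NA), then 4 / 2 - 1 = 1, i.e. +100%
change <- n / previous - 1
change
```

Because `lag()` works on row order, it is safest to make sure the rows are sorted by year within each group before computing the change.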
Another approach is to count the members per group in each year and join the counts:

library(tidyverse)
year1 <- tibble(people = c("Joe A", "Max X", "Sam M", "Jane K", "Doug K"), group = c(1, 1, 1, 2, 2))
year2 <- tibble(people = c("Joe A", "Sam M", "Jane K", "Doug K", "Mike K", "Jen G", "Mohamad T"), group = c(1, 1, 1, 2, 2, 2, 2))
# n.x holds the year 1 count, n.y the year 2 count
year1 %>%
  group_by(group) %>%
  summarise(n = n()) %>%
  full_join(year2 %>%
              group_by(group) %>%
              summarise(n = n()), by = "group") %>%
  mutate(change = n.y - n.x, percent_change = change / n.x) %>%
  ungroup() %>%
  select(group, n.x, n.y, change, percent_change) %>%
  print()

Output (n.y = year 2, n.x = year 1):

# A tibble: 2 x 5
  group   n.x   n.y change percent_change
  <dbl> <int> <int>  <int>          <dbl>
1     1     3     3      0              0
2     2     2     4      2              1
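One thing to watch with `full_join()`: a group that exists in only one of the two years ends up with `NA` in the other year's count. The sketch below shows one possible way to handle that. It is an assumption about how such groups should be treated, not part of the answer above, and group 3 and the `counts_y1` / `counts_y2` tables are hypothetical data made up for illustration.

```r
library(dplyr)

# Hypothetical per-group counts: group 3 appears only in year 2
counts_y1 <- tibble(group = c(1, 2), n = c(3L, 2L))
counts_y2 <- tibble(group = c(1, 2, 3), n = c(3L, 4L, 1L))

result <- full_join(counts_y1, counts_y2, by = "group",
                    suffix = c("_year1", "_year2")) |>
  mutate(
    # treat a group that is missing in one year as having zero members
    n_year1 = coalesce(n_year1, 0L),
    n_year2 = coalesce(n_year2, 0L),
    change  = n_year2 - n_year1,
    # percent change is undefined when the year 1 count is zero
    percent_change = if_else(n_year1 == 0L, NA_real_, change / n_year1)
  )

result
```

Using `suffix` also gives the count columns more descriptive names than the default `n.x` and `n.y`.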
