How to create a frequency variable in R (data frame) for how often a value occurs per name per year



I am struggling to create a new variable/column in my data frame that contains, for each name and each year, the frequency of wi = 1.

Here is a test data frame:

df <- tibble::as_tibble(data.frame(
  Name = c("x","x","x","x","x","y","y","y","y","y","y"),
  Year = c(2011,2011,2011,2012,2012,2011,2011,2012,2012,2012,2012),
  id   = c(8,23,1,5,7,25,83,6,2,9,10),
  wi   = c(1,0,1,1,0,1,1,0,0,1,0)
))

# A tibble: 11 × 4
Name   Year    id    wi
<chr> <dbl> <dbl> <dbl>
1 x      2011     8     1
2 x      2011    23     0
3 x      2011     1     1
4 x      2012     5     1
5 x      2012     7     0
6 y      2011    25     1
7 y      2011    83     1
8 y      2012     6     0
9 y      2012     2     0
10 y      2012     9     1
11 y      2012    10     0

Ideally, the data frame would end up looking like this:


df
# A tibble: 11 × 5
Name   Year    id    wi freq_wi
<chr> <dbl> <dbl> <dbl>   <dbl>
1 x      2011     8     1    0.66
2 x      2011    23     0    0.66
3 x      2011     1     1    0.66
4 x      2012     5     1    0.5 
5 x      2012     7     0    0.5 
6 y      2011    25     1    1   
7 y      2011    83     1    1   
8 y      2012     6     0    0.25
9 y      2012     2     0    0.25
10 y      2012     9     1    0.25
11 y      2012    10     0    0.25

Thanks for any help!

Here is another dplyr solution:

library(dplyr)

df %>%
  group_by(Name, Year) %>%
  # Count = number of rows with wi == 1 in this Name/Year group
  mutate(Count = sum(wi == 1),
         # divide by the group size; this gives 0 (not NaN) when Count is 0
         freq_wi = Count / n()) %>%
  ungroup() %>%
  select(-Count)

Name   Year    id    wi freq_wi
<chr> <dbl> <dbl> <dbl>  <dbl>
1 x      2011     8     1  0.667
2 x      2011    23     0  0.667
3 x      2011     1     1  0.667
4 x      2012     5     1  0.5  
5 x      2012     7     0  0.5  
6 y      2011    25     1  1    
7 y      2011    83     1  1    
8 y      2012     6     0  0.25 
9 y      2012     2     0  0.25 
10 y      2012     9     1  0.25 
11 y      2012    10     0  0.25 
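
For comparison, the grouped proportion can also be computed in a single `mutate()` step. The sketch below assumes `wi` only takes the values 0 and 1, and it rebuilds the test data from the question so that it runs on its own; `freq_wi` is simply the column name used in the question, and `res` is a name chosen here for illustration.

```r
library(dplyr)

# Test data from the question
df <- tibble::tibble(
  Name = c("x","x","x","x","x","y","y","y","y","y","y"),
  Year = c(2011,2011,2011,2012,2012,2011,2011,2012,2012,2012,2012),
  id   = c(8,23,1,5,7,25,83,6,2,9,10),
  wi   = c(1,0,1,1,0,1,1,0,0,1,0)
)

# Share of rows with wi == 1 within each Name/Year group,
# repeated on every row of that group
res <- df %>%
  group_by(Name, Year) %>%
  mutate(freq_wi = mean(wi == 1)) %>%
  ungroup()

res
```

Because `mean()` of a logical vector treats `TRUE` as 1, a group without any `wi == 1` rows gets 0 here instead of running into a division by zero.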

If wi is always 0 or 1, you can do the following (because mean(wi) is then equal to the frequency of wi == 1):

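library(dplyr)

df %>%
  group_by(Name, Year) %>%
  # one row per Name/Year with the mean of wi (the share of 1s)
  summarise(freq_wi = mean(wi), .groups = "drop") %>%
  # attach the group-level value back to every original row
  left_join(df, ., by = c("Name", "Year"))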
Name   Year    id    wi freq_wi
<fct> <dbl> <dbl> <dbl>   <dbl>
1 x      2011     8     1   0.667
2 x      2011    23     0   0.667
3 x      2011     1     1   0.667
4 x      2012     5     1   0.5  
5 x      2012     7     0   0.5  
6 y      2011    25     1   1    
7 y      2011    83     1   1    
8 y      2012     6     0   0.25 
9 y      2012     2     0   0.25 
10 y      2012     9     1   0.25 
11 y      2012    10     0   0.25 
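
If loading dplyr is not an option, roughly the same result can be obtained in base R with `ave()`, which applies a function within groups and returns a vector aligned with the original rows. This is only a sketch under the same assumption that `wi` is always 0 or 1; it rebuilds the question's test data as a plain `data.frame` so that it is self-contained.

```r
# Test data from the question, as a plain data.frame
df <- data.frame(
  Name = c("x","x","x","x","x","y","y","y","y","y","y"),
  Year = c(2011,2011,2011,2012,2012,2011,2011,2012,2012,2012,2012),
  id   = c(8,23,1,5,7,25,83,6,2,9,10),
  wi   = c(1,0,1,1,0,1,1,0,0,1,0)
)

# ave() computes FUN within each Name/Year combination and returns
# a vector of the same length as wi, in the original row order
df$freq_wi <- ave(df$wi, df$Name, df$Year, FUN = mean)

df
```

No join is needed in this variant, since the group mean is written straight back onto every row.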
