R - Frequency by group

I have a data frame containing the variable day (weekday, 1-7) and the time variables t1 to t7, which record the activity performed in a given time slot.

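For each time slot, I want to determine how often the same activity occurs across the 7 weekdays, and report the most frequent activity (or activities, in case of a tie).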

Input:

day t1 t2 t3 t4 t5 t6 t7
1  1  0  1  0  0  0  1
1  1  0  1  0  4  0  1
4  2  3  1  0  1  0  1
5  1  1  1  0  0  0  1
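
To make the example reproducible, the input above could be built roughly as follows. The name df matches the code in the answer; using data.frame() is an assumption, since the question does not show how the data was created.

```r
# Sketch: rebuild the four input rows shown above as a data frame.
# Column names and values are copied from the question; the construction
# via data.frame() is an assumption.
df <- data.frame(
  day = c(1, 1, 4, 5),
  t1  = c(1, 1, 2, 1),
  t2  = c(0, 0, 3, 1),
  t3  = c(1, 1, 1, 1),
  t4  = c(0, 0, 0, 0),
  t5  = c(0, 4, 1, 0),
  t6  = c(0, 0, 0, 0),
  t7  = c(1, 1, 1, 1)
)

print(df)
```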

Output:

time   Most frequent
t1     1    
t2     0,1,3       
t3     1
t4     0
t5     0
t6     0
t7     1

Here is a dplyr solution (pivot_longer() comes from tidyr):

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(-day) %>%
  group_by(name, value) %>%
  distinct() %>%
  mutate(freq = n()) %>%
  group_by(name) %>%
  filter(freq == max(freq)) %>%
  select(name, value) %>%
  distinct() %>%
  group_by(name) %>%
  summarise(`Most frequent` = paste(value, collapse = ",")) %>%
  rename(time = name)

This gives:

time  `Most frequent`
<chr> <chr>          
1 t1    1              
2 t2    0,3,1          
3 t3    1              
4 t4    0              
5 t5    0              
6 t6    0              
7 t7    1
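
The t2 row lists three activities because they tie once each day is counted only once. A small base R check of that slot, as a sketch that assumes the input from the question is stored in df, might look like this:

```r
# Sketch: why t2 reports three activities.
# Assumes the four input rows from the question.
df <- data.frame(
  day = c(1, 1, 4, 5),
  t2  = c(0, 0, 3, 1)
)

# Keep one row per day + activity, so the repeated 0 on day 1 counts once.
t2_once_per_day <- unique(df[, c("day", "t2")])

# Count on how many distinct days each activity appears in slot t2.
t2_counts <- table(t2_once_per_day$t2)
print(t2_counts)
```

Activities 0, 1 and 3 each appear on exactly one distinct day, so all three are kept as the most frequent.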

Here is the code with some comments:

df %>%
  pivot_longer(-day) %>% # Reshape to long format: one row per day, time slot (name) and activity (value)
  group_by(name, value) %>% # Group by time slot (t#) and activity
  distinct() %>% # Drop duplicate rows, so an activity repeated in the same slot on the same day counts once
  mutate(freq = n()) %>% # Count the remaining rows in each time slot + activity group
  group_by(name) %>% # Regroup by time slot
  filter(freq == max(freq)) %>% # Keep only the most frequent activities (all tied activities are kept)
  select(name, value) %>% # Keep only the name and value columns
  distinct() %>% # Collapse to one row per time slot + activity
  group_by(name) %>% # Group by time slot again
  summarise(`Most frequent` = paste(value, collapse = ",")) %>% # Paste tied activities into one comma-separated string per time slot
  rename(time = name) # Rename the name column to time
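
The same steps can probably be written more compactly with count(). This is an alternative sketch, not the answer's own code: the column names time, activity and freq are chosen here for illustration, and the data frame is rebuilt from the question's input.

```r
library(dplyr)
library(tidyr)

# Input rows copied from the question.
df <- data.frame(
  day = c(1, 1, 4, 5),
  t1  = c(1, 1, 2, 1),
  t2  = c(0, 0, 3, 1),
  t3  = c(1, 1, 1, 1),
  t4  = c(0, 0, 0, 0),
  t5  = c(0, 4, 1, 0),
  t6  = c(0, 0, 0, 0),
  t7  = c(1, 1, 1, 1)
)

result <- df %>%
  # Long format with explicit column names (illustrative names).
  pivot_longer(-day, names_to = "time", values_to = "activity") %>%
  # Count each activity at most once per day and time slot.
  distinct(day, time, activity) %>%
  # Number of distinct days per time slot + activity.
  count(time, activity, name = "freq") %>%
  group_by(time) %>%
  # Keep the most frequent activities, including ties.
  filter(freq == max(freq)) %>%
  # One comma-separated string per time slot.
  summarise(`Most frequent` = paste(activity, collapse = ","))

print(result)
```

Because count() returns its rows sorted by time and activity, tied activities come out in ascending order (0,1,3 for t2) rather than in order of first appearance.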
