Say I have a data frame in R:
frame object positive
1 6 0
2 6 1
3 6 1
4 6 1
5 6 1
6 6 0
7 6 0
8 6 1
9 6 1
10 6 1
1 7 1
2 7 1
3 7 1
4 7 1
5 7 1
6 7 0
7 7 1
8 7 0
9 7 1
10 7 1
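I am trying to create a new table that, for each individual object, counts the consecutive occurrences of the value 1 in the positive column and outputs the maximum and mean run length. It would look like: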
object max mean
6 4 3.5
7 5 8/3
Thanks for your help!
Here is a solution that uses data.table::rleid to find the consecutive runs of 1s.
library("tidyverse")
df <- tibble::tribble(
~frame, ~object, ~positive,
1L, 6L, 0L,
2L, 6L, 1L,
3L, 6L, 1L,
4L, 6L, 1L,
5L, 6L, 1L,
6L, 6L, 0L,
7L, 6L, 0L,
8L, 6L, 1L,
9L, 6L, 1L,
10L, 6L, 1L,
1L, 7L, 1L,
2L, 7L, 1L,
3L, 7L, 1L,
4L, 7L, 1L,
5L, 7L, 1L,
6L, 7L, 0L,
7L, 7L, 1L,
8L, 7L, 0L,
9L, 7L, 1L,
10L, 7L, 1L
)
df %>%
  group_by(object) %>%
  mutate(
    sequence = data.table::rleid(positive == 1),
  ) %>%
  filter(
    positive == 1
  ) %>%
  group_by(
    object, sequence
  ) %>%
  summarise(
    length = n()
  ) %>%
  summarise(
    max = max(length),
    mean = mean(length)
  )
#> `summarise()` has grouped output by 'object'. You can override using the
#> `.groups` argument.
#> # A tibble: 2 × 3
#> object max mean
#> <int> <int> <dbl>
#> 1 6 4 3.5
#> 2 7 5 2.67
Created on 2022-07-26 by the reprex package (v2.0.1)
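To make the run-id step easier to follow, here is a minimal base R sketch of what the pipeline above relies on: every stretch of identical values gets its own id, so grouping by that id isolates each run of 1s. It uses rle() and rep() as a stand-in for data.table::rleid (an assumption made only so the snippet has no extra dependency), and the vector is object 6 from the question.

```r
# positive values for object 6, copied from the question
positive <- c(0, 1, 1, 1, 1, 0, 0, 1, 1, 1)

# run-length encode the logical vector: one entry per run
r <- rle(positive == 1)

# give every element the index of the run it belongs to
# (base R stand-in for data.table::rleid)
run_id <- rep(seq_along(r$lengths), r$lengths)
run_id
#> [1] 1 2 2 2 2 3 3 4 4 4

# lengths of the runs whose value is TRUE, i.e. the runs of 1s
ones_lengths <- r$lengths[r$values]
ones_lengths
#> [1] 4 3
```

The two run lengths, 4 and 3, give the max of 4 and mean of 3.5 reported for object 6.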
I created my own data, so the output will not be exactly what you showed. It should still work, though.
library(dplyr)
set.seed(111)
df <- data.frame(frame=c(1:10,1:10),
object=rep(6:7, each=10),
positive=sample(0:1,20, replace=T))
df
frame object positive
1 1 6 1
2 2 6 1
3 3 6 1
4 4 6 0
5 5 6 1
6 6 6 0
7 7 6 0
8 8 6 0
9 9 6 1
10 10 6 1
11 1 7 1
12 2 7 0
13 3 7 1
14 4 7 0
15 5 7 0
16 6 7 1
17 7 7 0
18 8 7 0
19 9 7 0
20 10 7 1
df %>% group_by(object) %>% summarise(mean=mean(rle(positive)$lengths[rle(positive)$values==1]) ,
max=max(rle(positive)$lengths[rle(positive)$values==1]))
# A tibble: 2 × 3
object mean max
<int> <dbl> <int>
1 6 2 3
2 7 1 1
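The same rle() idea can be sketched without dplyr, with rle() called once per object instead of four times. This is only an illustration on the question's data; run_lengths is a hypothetical helper name, not part of any package.

```r
# data from the question
df <- data.frame(
  frame    = rep(1:10, 2),
  object   = rep(6:7, each = 10),
  positive = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 1,
               1, 1, 1, 1, 1, 0, 1, 0, 1, 1)
)

# hypothetical helper: lengths of the runs of 1s in a vector
run_lengths <- function(x) {
  r <- rle(x)
  r$lengths[r$values == 1]
}

# split positive by object, then compute the run lengths per object
runs <- lapply(split(df$positive, df$object), run_lengths)

result <- data.frame(
  object = as.integer(names(runs)),
  max    = sapply(runs, max),
  mean   = sapply(runs, mean),
  row.names = NULL
)
result
```

For this data the result should match the table requested in the question (max 4 and mean 3.5 for object 6, max 5 and mean 8/3 for object 7). Note that an object with no 1s at all would yield an empty vector of run lengths, for which max() returns -Inf with a warning and mean() returns NaN, so such objects may need separate handling.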