Calculating the mean of level based on status



I want to calculate a mean over my data, which looks like this:

    -----------
    level   sts
    -----------
    10      s
    -----------
    11      s
    -----------
    10      s
    -----------
    10      s
    -----------
    10      s
    -----------
    9       r
    -----------
    8.5     r
    -----------
    8       s
    -----------
    8.1     s
    -----------
    8       s
    -----------

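The mean should be calculated based on sts (s = stop, r = run): each consecutive run of "s" rows is averaged, while "r" rows keep their own values. I want output like this: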

    -----------
    level   sts
    -----------
    10.2     s
    -----------
    9        r
    -----------
    8.5      r
    -----------
    8.03     s
    -----------

In the end, the output should look like this:

    -----------
    level   sts
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    10.2    s
    -----------
    9       r
    -----------
    8.5     r
    -----------
    8.03    s
    -----------
    8.03    s
    -----------
    8.03    s
    -----------

If an answer to this already exists, please give me the link. Thanks.

Based on your desired output, I would try the following:

    library(data.table)
    # rleid() gives every consecutive run of identical sts values its own id;
    # the mean is then written back only on the "s" rows, one mean per run.
    setDT(mydf)[, group := rleid(sts)][
      sts == "s", level := mean(level), .(sts, group)][]
    #         level sts group
    #  1: 10.200000   s     1
    #  2: 10.200000   s     1
    #  3: 10.200000   s     1
    #  4: 10.200000   s     1
    #  5: 10.200000   s     1
    #  6:  9.000000   r     2
    #  7:  8.500000   r     2
    #  8:  8.033333   s     3
    #  9:  8.033333   s     3
    # 10:  8.033333   s     3
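
The key step above is the run id. As a rough illustration (a sketch, not part of the original answer), the ids that `rleid(sts)` yields can be rebuilt in base R with `rle()`, which also shows why grouping by `sts` alone would not give the desired result: it would pool the two separate "s" runs into a single mean. The variable names `pooled` and `per_run` are made up for this example.

```r
# Values copied from the question.
level <- c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8)
sts   <- c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s")

# Base R reconstruction of the run ids: a new id starts whenever sts changes.
runs  <- rle(sts)
group <- rep(seq_along(runs$lengths), runs$lengths)
print(group)
# 1 1 1 1 1 2 2 3 3 3

# Grouping by sts alone mixes the first and the last "s" run together.
pooled <- tapply(level, sts, mean)
print(pooled[["s"]])
# 9.3875

# Grouping by run id keeps the runs apart (10.2 for run 1, about 8.03 for run 3).
per_run <- tapply(level, group, mean)
print(per_run)
```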

I guess that in the "tidyverse" the equivalent would be:

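    library(tidyverse)
    library(data.table) # for `rleid`
    mydf %>%
      mutate(group = rleid(sts)) %>%  # id for each consecutive run of sts
      group_by(sts, group) %>%
      mutate(level = case_when(
        sts == "s" ~ mean(level),     # "s" rows get the mean of their run
        TRUE ~ level                  # "r" rows are left unchanged
      )) %>%
      ungroup()                       # drop the grouping from the result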
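
If neither package is available, the same idea can probably be expressed in base R. The following is a minimal sketch under that assumption, not the answerer's method: `rle()` stands in for `rleid`, `ave()` computes the per-run mean, and the names `out`, `run_mean`, `keep` and `condensed` are invented for illustration. It also derives the shorter table from the question (one row per "s" run, "r" rows kept as they are).

```r
# Sample data copied from the question.
mydf <- data.frame(
  level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8),
  sts   = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s"),
  stringsAsFactors = FALSE
)

# Run ids: a new id starts each time sts changes.
runs   <- rle(mydf$sts)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Per-run mean for every row, applied only where sts == "s".
run_mean  <- ave(mydf$level, run_id, FUN = mean)
out       <- mydf
out$level <- ifelse(out$sts == "s", run_mean, out$level)
print(out)

# Condensed view: keep every "r" row, but only the first row of each "s" run.
keep      <- out$sts == "r" | !duplicated(run_id)
condensed <- out[keep, ]
print(condensed)
```

Here `out` corresponds to the final ten-row table and `condensed` to the four-row summary asked for in the question.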

Sample data:

    mydf <- structure(list(
        level = c(10, 11, 10, 10, 10, 9, 8.5, 8, 8.1, 8),
        sts = c("s", "s", "s", "s", "s", "r", "r", "s", "s", "s")),
        .Names = c("level", "sts"), row.names = c(NA, 10L), class = "data.frame")
