Divide different groups by a reference group



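I ran into a problem almost identical to this one: Divide different groups by a reference group

I have this df, only with more grouping variables (and without the result column):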

df <- data.frame(
  pop = c(1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3),
  # The original listing had only 15 states (three NJ); a fourth "NJ" is assumed so every column has 16 rows
  state = c("NJ","NJ","NJ","NJ","VT","VT","VT","VT","DC","DC","DC","DC","IL","IL","IL","IL"),
  start_dt = as.Date(c("2010-01-01","2010-01-01","2010-01-01","2010-01-01","2010-01-02","2010-01-02","2010-01-02","2010-01-02","2010-02-03","2010-02-03","2010-02-03","2010-02-03","2010-03-05","2010-03-05","2010-03-05","2010-03-05")),
  end_dt = as.Date(c("2011-01-01","2011-01-01","2011-01-01","2011-01-01","2011-01-02","2011-01-02","2011-01-02","2011-01-02","2011-02-03","2011-02-03","2011-02-03","2011-02-03","2011-03-05","2011-03-05","2011-03-05","2011-03-05")),
  value = c(12,7,6,9,15,7,6,9,18,5,6,3,20,5,5,6),
  group = c("denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3","denominator", "Treated1", "Treated2", "Treated3")
  # Desired output column (truncated in the original): result = c(1,0.58,0.5,0.75,1,0.46...)
)

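I also want to group the data by pop (population), state, start_dt and end_dt, and then divide the value of each subgroup in group by the denominator value from the same grouping to get the result column. I tried the accepted answer and did the following: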

library(dplyr)

# First attempt: mutate()
df <- df %>%
  group_by(pop,state,start_dt,end_dt) %>%
  mutate(result=value/value[group == "denominator"])

# Second attempt: summarize()
df <- df %>%
  group_by(pop,state,start_dt,end_dt) %>%
  summarize(result = value[group != "denominator"] / value[group == "denominator"])

But I got an error:

group_by: 4 grouping variables (pop, state, start_dt, end_dt)
Error in `.fun()`:
! Problem while computing `result=value/value[group == "denominator"]`.
x `result` must be size 1, not 0.
i The error occurred in group 99: pop = "1", group = "Treated2", state =
"DC", start_dt = 2010-01-01, end_dt = 2011-02-01.
Backtrace:
1. ... %>% ...
2. tidylog::mutate(., result=value/value[group == "denominator"])
3. tidylog:::log_mutate(...)
5. dplyr:::mutate.data.frame(.data, ...)

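Any ideas?

The problem is that at least one group has no denominator row. In that group, value[group == "denominator"] has length 0, so the division also has length 0, and mutate() rejects it. We can take the first element with [1], which returns NA when no denominator row exists: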

library(dplyr)
df %>%
  group_by(pop, state, start_dt, end_dt) %>%
  summarize(result = value[group != "denominator"] /
              value[group == "denominator"][1],
            group = group[group != "denominator"], .groups = "drop")
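To see why the [1] trick works, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical, not taken from the question, and only two grouping columns are used to keep it short. It first shows that indexing an empty vector with [1] gives NA, then lists the groups that lack a denominator row, and finally uses mutate() instead of summarize() as one possible variant that keeps every row.

```r
library(dplyr)

# Hypothetical toy data: the pop 2 / DC group has no "denominator" row.
toy <- data.frame(
  pop   = c(1, 1, 1, 2, 2),
  state = c("NJ", "NJ", "NJ", "DC", "DC"),
  value = c(12, 7, 6, 5, 6),
  group = c("denominator", "Treated1", "Treated2", "Treated1", "Treated2")
)

# Indexing an empty vector with [1] yields NA instead of a zero-length vector.
empty <- numeric(0)
empty[1]

# Diagnostic: list the groups that have no denominator row.
missing_denominator <- toy %>%
  group_by(pop, state) %>%
  filter(!any(group == "denominator")) %>%
  ungroup() %>%
  distinct(pop, state)

# mutate() variant: every row is kept; groups without a denominator get NA.
with_result <- toy %>%
  group_by(pop, state) %>%
  mutate(result = value / value[group == "denominator"][1]) %>%
  ungroup()
```

Running the diagnostic on the real df first can help decide whether an NA result is acceptable or whether the missing denominator rows point to a data problem.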
