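The code below works fine, but I would like to create another variable called JOV after SPV. The variable should follow these conditions:
If I have "Category", "Week" and "DTT" in group_cols, do: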
SPV %>% filter(date2 == dmda, Category == CategoryChosse, DTT==DTest)
If I have "Category" and "Week" in group_cols, do:
SPV %>% filter(date2 == dmda, Category == CategoryChosse)
If I only have "Category" in group_cols, do:
SPV %>% filter(date2 == dmda)
Executable code below:
library(dplyr)
library(tidyverse)
library(lubridate)
df1 <- structure(
list(date1= c("2021-06-28","2021-06-28","2021-06-28","2021-06-28"),
date2 = c("2021-06-23","2021-06-24","2021-06-30","2021-07-01"),
DTT= c("Hol","Hol","Hol",0),
Week= c("Wednesday","Thursday","Wednesday","Thursday"),
Category = c("ABC","FDE","ABC","FDE"),
DR1 = c(4,1,1,2),
DR01 = c(4,1,2,3), DR02= c(4,2,0,2),DR03= c(9,5,0,1),
DR04 = c(5,4,3,2),DR05 = c(5,4,0,2)),
class = "data.frame", row.names = c(NA, -4L))
dmda<-"2021-07-01"
CategoryChosse<-"FDE"
DTest<-"Hol"
Wk<-"Thursday"
Dx<-subset(df1,df1$date2<df1$date1)
x<-Dx %>% select(starts_with("DR0"))
x<-cbind(Dx, setNames(Dx$DR1 - x, paste0(names(x), "_PV")))
PV<-select(x, date2,Week, Category, DTT, DR1, ends_with("PV"))
group_cols <-
  if (any(PV$DTT == DTest & PV$Week == Wk, na.rm = TRUE)) {
    c("Category", "Week", "DTT")
  } else if (any(PV$Week == Wk & PV$Category == CategoryChosse & PV$DTT != DTest, na.rm = TRUE)) {
    c("Category", "Week")
  } else {
    "Week"
  }
med <- PV %>%
group_by(across(all_of(group_cols))) %>%
summarize(across(ends_with("PV"), median),.groups = 'drop')
SPV <- df1 %>%
  inner_join(med, by = group_cols) %>%
  mutate(across(matches("^DR0\\d+$"), ~.x +
                  get(paste0(cur_column(), '_PV')),
                .names = '{col}_{col}_PV')) %>%
  select(date1:Category, DR01_DR01_PV:last_col())
Try:
SPV %>%
filter(
date2 == dmda,
!("Category" %in% group_cols) | Category == CategoryChosse,
!all(c("Category", "DTT") %in% group_cols) | DTT == DTest
)
This is a literal translation of your conditions. However, if I read them correctly, it can be simplified a bit to
SPV %>%
filter(
date2 == dmda,
!("Category" %in% group_cols) | Category == CategoryChosse,
!("DTT" %in% group_cols) | DTT == DTest
)
in case you ever envision allowing "DTT" in group_cols without "Category". (It works even if that case never happens.)
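The `!(col %in% group_cols) | test` pattern reads as "if the column is a grouping column, then require the test; otherwise let every row through". A minimal sketch of that behaviour, assuming a made-up `toy` data frame in place of SPV and a hypothetical `keep_rows()` helper that only wraps the filter above:

```r
library(dplyr)

# Hypothetical toy data standing in for SPV
toy <- data.frame(
  date2    = c("2021-07-01", "2021-07-01", "2021-06-30"),
  Category = c("FDE", "ABC", "FDE"),
  DTT      = c("Hol", "Hol", "0")
)

# Hypothetical helper wrapping the simplified filter
keep_rows <- function(data, group_cols, dmda, CategoryChosse, DTest) {
  data %>%
    filter(
      date2 == dmda,
      # TRUE for every row when "Category" is not a grouping column,
      # so the Category test is effectively skipped
      !("Category" %in% group_cols) | Category == CategoryChosse,
      # same idea for "DTT"
      !("DTT" %in% group_cols) | DTT == DTest
    )
}

# All three grouping columns: date2, Category and DTT are all tested
all_three <- keep_rows(toy, c("Category", "Week", "DTT"),
                       "2021-07-01", "FDE", "Hol")

# Only "Week": just the date2 test applies
week_only <- keep_rows(toy, "Week", "2021-07-01", "FDE", "Hol")
```

Because the left-hand side of each `|` is a single TRUE or FALSE, it is recycled across all rows, which is what switches the corresponding test on or off.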
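You can store the different cases in a list and extract the elements as needed.
- Change this block to return indices into the vector and the list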
filter_condition <-
if (any(PV$DTT == DTest & PV$Week == Wk, na.rm = TRUE)) {
1:3 # if you had options out of order, you could have something like c(1, 3)
} else if (any(PV$Week == Wk & PV$Category == CategoryChosse & PV$DTT != DTest, na.rm=TRUE)) {
1:2
} else {
1
}
- Create the vector and the list used for grouping and filtering
group_cols <- c("Week", "Category", "DTT")
filter_opts <- rlang::exprs(date2 == dmda,
Category == CategoryChosse,
DTT == DTest)
- Change group_by() to extract from the variable depending on which case you are in
med <- PV %>%
group_by(across(all_of(group_cols[filter_condition]))) %>%
summarize(across(ends_with("PV"), median),.groups = 'drop')
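- Same code as before, except that the join uses only the selected grouping columns
SPV <- df1 %>%
  inner_join(med, by = group_cols[filter_condition]) %>%
  mutate(across(matches("^DR0\\d+$"), ~.x +
                  get(paste0(cur_column(), '_PV')),
                .names = '{col}_{col}_PV')) %>%
  select(date1:Category, DR01_DR01_PV:last_col())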
- Apply the filter conditions depending on which case you are in
SPV %>%
filter(!!!filter_opts[filter_condition])
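As mentioned in the other answer, you could shorten this, since it seems the date2 condition is always wanted. But this sets up a framework for similar things.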
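To see what the `!!!` splice does in isolation, here is a small sketch on a made-up `toy` data frame (hypothetical data, not the SPV from the question). It assumes the same variable names as above for the values being compared.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for SPV
toy <- data.frame(
  date2    = c("2021-07-01", "2021-07-01", "2021-06-30"),
  Category = c("FDE", "ABC", "FDE"),
  DTT      = c("Hol", "Hol", "0")
)
dmda <- "2021-07-01"
CategoryChosse <- "FDE"
DTest <- "Hol"

# exprs() captures the conditions unevaluated, as a plain list
filter_opts <- exprs(date2 == dmda,
                     Category == CategoryChosse,
                     DTT == DTest)

# Subsetting with [ keeps a list of expressions; !!! splices each element
# into filter() as its own argument, so the conditions are combined with AND
first_only <- toy %>% filter(!!!filter_opts[1])
first_two  <- toy %>% filter(!!!filter_opts[1:2])
```

Here `filter_opts[1]` applies only the date2 condition, while `filter_opts[1:2]` adds the Category condition, mirroring how `filter_condition` selects the conditions in the answer.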