r - Pass only one of several argument choices to a function and pipe conditionally in dplyr



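I am looking for a way to conditionally pass only one argument (one of three options) to a function. Depending on the choice, I simply want to create a variable in the dataset. Suppose we have the following dataset: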

set.seed(10)
test <- data.frame(time_stamp = sample(seq(as.Date('1999/01/01'), as.Date('2012/01/01'), by="day"), 12))
test
#    time_stamp
# 1  2000-05-05
# 2  2009-03-09
# 3  2008-04-24
# 4  2011-03-22
# 5  2003-05-27
# 6  2003-01-01
# 7  2008-10-22
# 8  2003-10-13
# 9  2011-02-26
# 10 2008-08-27
# 11 2011-12-30
# 12 2001-07-18

When I run my function, the output I want is as follows:

test_fun(type = "halfs") 
#or more simply
test_fun(halfs)
#    time_stamp half_var
# 1  2000-05-05  H1 2000
# 2  2009-03-09  H1 2009
# 3  2008-04-24  H1 2008
# 4  2011-03-22  H1 2011
# 5  2003-05-27  H1 2003
# 6  2003-01-01  H1 2003
# 7  2008-10-22  H2 2008
# 8  2003-10-13  H2 2003
# 9  2011-02-26  H1 2011
# 10 2008-08-27  H2 2008
# 11 2011-12-30  H2 2011
# 12 2001-07-18  H2 2001

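Depending on which argument is chosen, I run an if statement inside the pipe. I thought I could do this by wrapping the conditional statement in {}, as I have seen suggested elsewhere, but I cannot get it to work. The function is as follows: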

test_fun <- function(type = c("halfs", "quarts", "other")) {
test %>% {
if (type == "halfs") {
mutate(half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
}  else if (type == "quarts") {
mutate(quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)), 
month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
}  else (type == "other") {
mutate(other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
}
}
}

I get an error about an unexpected bracket, but I think the problem lies with the conditional if inside the pipe (all brackets are closed).

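Another approach might be to use optional arguments, as in test_fun <- function(halfs, quarts = NULL, other = NULL), but that signature suggests halfs must be supplied, which is not the case. Really, I want something like test_fun <- function(halfs = NULL, quarts = NULL, other = NULL) or test_fun <- function(...), but I could not make that work. A way around this might be to supply the data as an argument: test_fun <- function(test, halfs = NULL, quarts = NULL, other = NULL), but I cannot figure it out.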

Any suggestions would be great.

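The syntax error is real and has to be fixed first. else (type == "other") is not valid syntax; I think you meant else if (type == "other"). Because there is no if, R parses (type == "other") as the body of the else, so the opening brace that follows it is unexpected.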

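However, when you pipe into a code block you also need to use . to place the piped value. The mutate calls inside {} should be written as mutate(., half_var = ...):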

library(dplyr)
library(lubridate)
test_fun <- function(type = c("halfs", "quarts", "other")) {
  # Reduce the default vector to one valid choice so each if() tests a single value
  type <- match.arg(type)
  test %>% {
    if (type == "halfs") {
      mutate(., half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
    } else if (type == "quarts") {
      mutate(., quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)),
                                       month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
                                       month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
                                       month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
    } else if (type == "other") {
      mutate(., other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
    }
  }
}
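The question also asks about supplying the data as an argument. A minimal sketch of that idea follows, assuming dplyr and lubridate are installed. The names add_period and data are hypothetical, and lubridate's quarter() is swapped in for the case_when chain. With the data frame as the first argument, the function can sit directly in a pipe and no braces or . placeholder are needed.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper (not from the original post): the data frame is the
# first argument, and `type` selects exactly one of the three variables.
add_period <- function(data, type = c("halfs", "quarts", "other")) {
  type <- match.arg(type)  # validates `type`; falls back to "halfs" if missing
  switch(type,
    halfs  = mutate(data, half_var = paste(ifelse(month(time_stamp) <= 6, "H1", "H2"),
                                           year(time_stamp))),
    quarts = mutate(data, quarts_var = paste0("q", quarter(time_stamp), " ",
                                              year(time_stamp))),
    other  = mutate(data, other = paste(ifelse(month(time_stamp) <= 6, "H1", "H2"),
                                        year(time_stamp)))
  )
}

# Small illustrative data frame using two dates from the question
demo <- data.frame(time_stamp = as.Date(c("2000-05-05", "2008-10-22")))
res_half  <- demo %>% add_period("halfs")
res_quart <- demo %>% add_period("quarts")
```

Because switch() evaluates only the matching branch, exactly one new column is added per call.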

These calculations are already directly available through the yearmon and yearqtr classes in the zoo package, so:

library(zoo)
test %>% 
mutate(yearmon = as.yearmon(time_stamp),
yearqtr = as.yearqtr(time_stamp),
yearhalf = paste0(as.integer(yearmon), " H", (cycle(yearmon) > 6) + 1))

giving:

time_stamp  yearmon yearqtr yearhalf
1  2005-08-07 Aug 2005 2005 Q3  2005 H2
2  2002-12-27 Dec 2002 2002 Q4  2002 H2
3  2004-07-19 Jul 2004 2004 Q3  2004 H2
4  2008-01-03 Jan 2008 2008 Q1  2008 H1
5  2000-02-08 Feb 2000 2000 Q1  2000 H1
6  2001-12-05 Dec 2001 2001 Q4  2001 H2
7  2002-07-26 Jul 2002 2002 Q3  2002 H2
8  2002-07-15 Jul 2002 2002 Q3  2002 H2
9  2006-12-29 Dec 2006 2006 Q4  2006 H2
10 2004-07-29 Jul 2004 2004 Q3  2004 H2
11 2007-06-16 Jun 2007 2007 Q2  2007 H1
12 2006-05-13 May 2006 2006 Q2  2006 H1
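The yearhalf expression packs several steps into one line. As a rough illustration, assuming zoo is installed, the pieces can be evaluated one at a time on two of the question's dates:

```r
library(zoo)

x  <- as.Date(c("2000-05-05", "2008-10-22"))
ym <- as.yearmon(x)

yr        <- as.integer(ym)   # year part, the month fraction is truncated
month_num <- cycle(ym)        # position of the month within its year (1 to 12)
half_num  <- (month_num > 6) + 1  # TRUE/FALSE coerced to 1/0, giving 2 for July-December, else 1

half <- paste0(yr, " H", half_num)
half
```

Here May 2000 falls in H1 and October 2008 in H2.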

Function

It is not clear that we really need a function, but for completeness:

test_fun <- function(x, type = c("month", "quarter", "half")) {
type <- match.arg(type)
ym <- as.yearmon(x)
if (type == "month") ym
else if (type == "quarter") as.yearqtr(x)
else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}
library(zoo)
test %>% 
mutate(yearmonth = test_fun(time_stamp, "month"),
yearqtr = test_fun(time_stamp, "quarter"),
yearhalf = test_fun(time_stamp, "half"))
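The function above relies on match.arg to turn the vector of allowed values into a single choice. A small base R sketch of that behaviour, using a hypothetical pick_type function that exists only for this illustration:

```r
# Hypothetical function: it only returns the resolved choice
pick_type <- function(type = c("month", "quarter", "half")) {
  match.arg(type)
}

default_choice <- pick_type()        # no argument: the first element is used
exact_choice   <- pick_type("half")  # exact match
partial_choice <- pick_type("q")     # unambiguous partial match

# A value outside the allowed set raises an error
bad_choice <- tryCatch(pick_type("year"), error = function(e) "error")
```

This is why the if/else chain can safely compare type against a single string.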

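Function with one argument

The subject line of the question asks for a function of one argument. I am not sure this is a good idea, since it means hard-coding which column to use, but if you really want to do that, we show how below. We actually provide a second argument in case you change your mind and want to specify the time_stamp column yourself; if it is not specified, it defaults appropriately when the function is called from within mutate.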

test_fun2 <- function(type = c("month", "quarter", "half"),
x = parent.frame()$.data$time_stamp) {
type <- match.arg(type)
ym <- as.yearmon(x)
if (type == "month") ym
else if (type == "quarter") as.yearqtr(x)
else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}
library(zoo)
test %>% 
mutate(month = test_fun2("month"),
quarter = test_fun2("quarter"),
halfs = test_fun2("half"))

Function returning a subset of the three columns

If what you meant is that you want test_fun3 to return up to 3 columns, then:

test_fun3 <- function(x, month = FALSE, quarter = FALSE, half = FALSE) {
ym <- as.yearmon(x)
data <- data.frame(yearmon = ym,
quarter = as.yearqtr(x),
half = paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1))
data[c(month, quarter, half)]
}
test %>% 
bind_cols(test_fun3(.$time_stamp, TRUE, TRUE))
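test_fun3 picks its columns by indexing the data frame with a logical vector built from the three flags. A base R sketch of that indexing step, using a made-up one-row data frame with the same column names:

```r
# Made-up one-row data frame with the same column names as in test_fun3
df <- data.frame(yearmon = "May 2000", quarter = "2000 Q2", half = "2000 H1")

# TRUE keeps a column, FALSE drops it; positions follow the column order
first_two <- names(df[c(TRUE, TRUE, FALSE)])
only_half <- names(df[c(FALSE, FALSE, TRUE)])
```

In the call above only two flags are passed, so half keeps its default of FALSE and the yearmon and quarter columns are returned.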
