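My goal is to conditionally mutate a column's data type using the tidyverse. Here is a reproducible example. Say I want to convert the column cyl to a factor. However, the levels and labels arguments of the factor should depend on whether the user has supplied an object bin.order or left it as NULL. I know how to do this outside the tidyverse, but I am looking for a more concise way to do it with tidyverse functions.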
mtcars %>%
  mutate(cyl = ifelse(is.null(bin.order),
                      factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))),
                      factor(x = cyl, levels = bin.order, labels = bin.order)))
The expected result would look like this:
# if bin.order is null
mtcars %>%
  mutate(cyl = factor(x = cyl, levels = sort(unique(cyl)), labels = sort(unique(cyl))))
# if bin.order is not null
bin.order = c(4, 6, 8)
mtcars %>%
  mutate(cyl = factor(x = cyl, levels = bin.order, labels = bin.order))
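You can use the %||% operator (from rlang, re-exported by purrr), which returns its left-hand side if that is not NULL and its right-hand side otherwise. That is, x %||% y is equivalent to if (is.null(x)) y else x.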
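As a quick illustration of that equivalence, here is a small sketch. It assumes the rlang package is installed, and the variable names user_order and fallback are made up for the example.

```r
library(rlang)  # provides %||%; purrr re-exports the same operator

fallback <- c(4, 6, 8)

# Hypothetical case 1: the user supplied nothing
user_order <- NULL
# The left-hand side is NULL, so the right-hand side is used
user_order %||% fallback

# Hypothetical case 2: the user supplied an explicit order
user_order <- c(6, 4, 8)
# The left-hand side is not NULL, so it is returned unchanged
user_order %||% fallback
```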
For your case:
library(dplyr)
library(purrr)
factor.bin.order <- function(x, bin.order = NULL) {
  factor(x, bin.order %||% sort(unique(x)))
}
mtcars2 <- mtcars %>%
  mutate(
    cyl1 = factor.bin.order(cyl),
    cyl2 = factor.bin.order(cyl, c(6, 4, 8))
  )
levels(mtcars2$cyl1)
# "4" "6" "8"
levels(mtcars2$cyl2)
# "6" "4" "8"
Also note that you do not need to specify labels when they are identical to levels, since that is the default behavior.
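The point about labels can be checked directly in base R. The sketch below builds the factor both ways and compares the results; the names lv, with_labels and without_labels are chosen just for this illustration.

```r
x  <- mtcars$cyl
lv <- sort(unique(x))

# Labels given explicitly, identical to the levels
with_labels <- factor(x, levels = lv, labels = lv)

# Labels omitted, so factor() falls back to its default (the levels)
without_labels <- factor(x, levels = lv)

# Compare the two factors
identical(with_labels, without_labels)
```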
One possible solution is to write a helper function:
fct_if <- function(x, bin.order = NULL) {
  if (is.null(bin.order)) {
    output <- factor(x = x, levels = sort(unique(x)), labels = sort(unique(x)))
  } else {
    output <- factor(x = x, levels = bin.order, labels = bin.order)
  }
  return(output)
}
mtcars %>%
  mutate(cyl = fct_if(cyl))
mtcars %>%
  mutate(cyl = fct_if(cyl, bin.order = c(4, 6, 8)))
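For context on why a plain if/else is used here rather than ifelse(), the following base R sketch may help. ifelse() shapes its result after the test, and is.null(bin.order) always has length one, so the whole column collapses to a single value that is no longer a factor. The names bad and good are only for this illustration.

```r
bin.order <- NULL
cyl <- mtcars$cyl

# ifelse() is vectorised over its test; the test has length one here,
# so the result has length one as well and loses its factor class
bad <- ifelse(is.null(bin.order),
              factor(cyl, levels = sort(unique(cyl))),
              factor(cyl, levels = bin.order))
length(bad)
is.factor(bad)

# if/else evaluates only the chosen branch and returns it whole
good <- if (is.null(bin.order)) {
  factor(cyl, levels = sort(unique(cyl)))
} else {
  factor(cyl, levels = bin.order)
}
length(good)
is.factor(good)
```

The same if/else expression can also be written directly inside mutate(), since it yields a full-length vector.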