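A very simple evaluation in dplyr::case_when() returns a strange error message with dplyr_1.0.8 under R version 4.1.2. I have isolated the behavior in the code below, where I try to adjust the value of the durationI variable when either of two edge cases occurs: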
library(tidyverse)
# Create simple example data
raw <- tribble(
  ~activity_ID, ~durationI, ~distanceI, ~tmode,
  1, 190, 57, "auto",
  2, 23, 41, NA,
  3, 91, 58, "rail"
)
# Now trip it up
update <- mutate(raw,
  distanceI = ifelse(is.na(tmode), NA, distanceI),
  durationI = case_when(
    is.na(tmode) ~ NA,
    durationI > 180 ~ 180,
    TRUE ~ durationI
  )
)
# Should result in:
# activity_ID, durationI, distanceI, tmode
# 1, 180, 57, auto
# 2, NA, NA, NA
# 3, 91, 58, rail
When I run this code, it produces the following error message:
Error in `mutate()`:
! Problem while computing `durationI = case_when(is.na(tmode) ~
NA, durationI > 180 ~ 180, TRUE ~ durationI)`.
Caused by error in `` names(message) <- `*vtmp*` ``:
! 'names' attribute [1] must be the same length as the vector [0]
Run `rlang::last_error()` to see where the error occurred.
When I run rlang::last_error(), the output is equally unhelpful:
<error/dplyr:::mutate_error>
Error in `mutate()`:
! Problem while computing `durationI = case_when(is.na(tmode) ~
NA, durationI > 180 ~ 180, TRUE ~ durationI)`.
Caused by error in `` names(message) <- `*vtmp*` ``:
! 'names' attribute [1] must be the same length as the vector [0]
Backtrace:
1. dplyr::mutate(...)
6. dplyr::case_when(...)
7. dplyr:::replace_with(...)
8. dplyr:::check_type(val, x, name, error_call = error_call)
9. rlang::abort(msg, call = error_call)
10. rlang:::signal_abort(cnd, .file)
11. base::signalCondition(cnd)
13. rlang:::conditionMessage.rlang_error(cond)
14. rlang::cnd_message(c)
15. rlang:::cnd_message_format(cnd, ...)
16. cli::cli_format(glue_escape(lines), .envir = emptyenv())
Run `rlang::last_trace()` to see the full context.
If I check the lengths of all the variables, they are of course all the same length. I'm stumped. What am I missing?
You are running into this problem because you are trying to mix logical and numeric vectors.
In your case_when statement:
case_when(
  is.na(tmode) ~ NA,
  durationI > 180 ~ 180,
  TRUE ~ durationI
)
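Your first case evaluates to NA, which on its own is a logical value. This makes R think you want a logical vector. When the next line evaluates to a numeric, you get the error.
You can fix this by replacing NA with NA_real_, the missing value of numeric (double) type: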
raw %>%
  mutate(
    distanceI = ifelse(is.na(tmode), NA, distanceI),
    durationI = case_when(
      is.na(tmode) ~ NA_real_,
      durationI > 180 ~ 180,
      TRUE ~ durationI
    )
  )
#> # A tibble: 3 × 4
#> activity_ID durationI distanceI tmode
#> <dbl> <dbl> <dbl> <chr>
#> 1 1 180 57 auto
#> 2 2 NA NA <NA>
#> 3 3 91 58 rail
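To see why the bare NA trips things up, it can help to check the storage type of each missing-value constant directly. The following is a small base R sketch, not part of the original question; the name na_types is just an illustrative choice.

```r
# Each typed missing-value constant has its own storage type.
na_types <- c(
  plain     = typeof(NA),             # bare NA is logical
  real      = typeof(NA_real_),       # double
  integer   = typeof(NA_integer_),    # integer
  character = typeof(NA_character_)   # character
)
print(na_types)

# The numeric literal used on the right-hand side in the question is a double,
# so it matches NA_real_ but not the bare (logical) NA.
rhs_type <- typeof(180)
print(rhs_type)
```

Picking the NA constant whose type matches the other right-hand sides (NA_real_ here, NA_character_ for strings, and so on) keeps every branch of case_when() on the same type.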
I ran into a similar problem, caused by inadvertently trying to mix numeric (double) and integer types:
# x was an integer, and I was trying to make it 1 (numeric) if NA
df %>%
  mutate(x = case_when(is.na(x) ~ 1))
Changing 1 to 1L fixed the problem.
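As a rough illustration of that integer fix, here is a self-contained sketch. The data frame df and its values are made up for the example, and the TRUE ~ x branch is an added assumption so that non-missing values are kept rather than turned into NA.

```r
library(dplyr)

# Hypothetical data: x is an integer column with one missing value
df <- tibble(x = c(5L, NA, 7L))

fixed <- df %>%
  mutate(x = case_when(
    is.na(x) ~ 1L,  # integer literal, same type as x
    TRUE ~ x        # keep all other values unchanged
  ))

print(fixed)
```

With 1L every right-hand side is an integer, so case_when() has no type mismatch to complain about.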