r - Error in 'dplyr::between()': 'left' must be length 1



I have a dataset of temperature and time readings from two weather stations, which looks like this:

# A tibble: 6 × 7
Station Date       Time     Temperature  Tmin  Tmed  Tmax
<chr>   <date>     <time>         <dbl> <dbl> <dbl> <dbl>
1 F       2021-10-15 00:11:46        16.8  15.2  17.1  20.4
2 F       2021-10-15 00:41:46        16.5  15.2  17.1  20.4
3 F       2021-10-15 01:11:46        16.2  15.2  17.1  20.4
4 F       2021-10-15 01:41:46        15.6  15.2  17.1  20.4
5 F       2021-10-15 02:11:46        15.9  15.2  17.1  20.4
6 F       2021-10-15 02:41:46        16.1  15.2  17.1  20.4

Here is a reproducible example covering the first two days, obtained with dput() (sorry, I know it's a mess):

TimeTempReprod <- structure(list(Station = c("F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
"F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F"), Date = structure(c(18915, 
18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 
18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 
18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 
18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 
18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 18915, 
18915, 18915, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 
18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 
18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 
18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 
18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 18916, 
18916, 18916, 18916, 18916, 18916), class = "Date"), Time = structure(c(706, 
2506, 4306, 6106, 7906, 9706, 11506, 13306, 15106, 16906, 18706, 
20506, 22306, 24106, 25906, 27706, 29506, 31306, 33106, 34906, 
36706, 38506, 40306, 42106, 43906, 45706, 47506, 49306, 51106, 
52906, 54706, 56506, 58306, 60106, 61906, 63706, 65506, 67306, 
69106, 70906, 72706, 74506, 76306, 78106, 79906, 81706, 83506, 
85306, 706, 2506, 4306, 6106, 7906, 9706, 11506, 13306, 15106, 
16906, 18706, 20506, 22306, 24106, 25906, 27706, 29506, 31306, 
33106, 34906, 36706, 38506, 40306, 42106, 43906, 45706, 47506, 
49306, 51106, 52906, 54706, 56506, 58306, 60106, 61906, 63706, 
65506, 67306, 69106, 70906, 72706, 74506, 76306, 78106, 79906, 
81706, 83506, 85306), class = c("hms", "difftime"), units = "secs"), 
Temperature = c(16.8, 16.5, 16.2, 15.6, 15.9, 16.1, 16.4, 
16.2, 16, 16, 16.2, 16.2, 15.9, 16, 16, 16.4, 16.2, 16.5, 
16.1, 16.4, 16.8, 16.6, 18.6, 16.9, 18.6, 19.5, 18.5, 18.5, 
20.4, 19.1, 19.8, 19.7, 18.1, 17.4, 17.4, 16.9, 15.8, 16.8, 
16.9, 16.8, 17, 15.2, 16.2, 17.4, 18.1, 18.3, 18, 17.9, 17.6, 
17.9, 17.7, 17.7, 17.7, 17.8, 18.1, 18.3, 18.1, 16.2, 18, 
18.8, 18.6, 19.1, 18.9, 17.9, 16.2, 17.3, 19.3, 20.2, 20.7, 
20.9, 22.2, 22.3, 21.2, 21.1, 20.1, 23.3, 21.4, 20.2, 19.8, 
18.9, 19.8, 20.1, 20.4, 19.5, 18.8, 18, 17.9, 17.9, 17.8, 
18, 17.9, 16.5, 16.8, 16.5, 16.7, 16.7), Tmin = c(15.2, 15.2, 
15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 
15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 
15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 
15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 
15.2, 15.2, 15.2, 15.2, 15.2, 15.2, 16.2, 16.2, 16.2, 16.2, 
16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 
16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 
16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 
16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 16.2, 
16.2, 16.2, 16.2, 16.2), Tmed = c(17.1, 17.1, 17.1, 17.1, 
17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 
17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 
17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 
17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 17.1, 
17.1, 17.1, 17.1, 17.1, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333, 18.8083333333333, 18.8083333333333, 
18.8083333333333, 18.8083333333333), Tmax = c(20.4, 20.4, 
20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 
20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 
20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 
20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 
20.4, 20.4, 20.4, 20.4, 20.4, 20.4, 23.3, 23.3, 23.3, 23.3, 
23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 
23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 
23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 
23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 23.3, 
23.3, 23.3, 23.3, 23.3)), row.names = c(NA, -96L), class = c("tbl_df", 
"tbl", "data.frame"))

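I want to add a column that tells me whether the temperature at a given time is close to the daily minimum temperature.

The dplyr::between function seemed like the best way to do this, so I tried writing it like this: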

library(dplyr)

TimeTempReprod %>% 
group_by(Date, Station) %>%
mutate(y = between(Temperature, Tmin, Tmin + 2))

When I run this code, this is what I get in the console:

Error in `mutate()`:
! Problem while computing `y = dplyr::between(Temperature, Tmin, Tmin + 2)`.
ℹ The error occurred in group 1: Date = 2021-10-15, Station = "F".
Caused by error in `dplyr::between()`:
! `left` must be length 1

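I tried searching for an answer to this problem, but I couldn't find anything related to the between function elsewhere...

I hope the question is understandable, and I apologize if there is anything wrong with it. This is my first question on Stack Exchange after two years of learning, so I still need to learn how to use it properly. Thanks to anyone who takes the time to help me!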

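between() needs a single value for each bound, but inside mutate() Tmin refers to the whole vector of values in each group. To fix this, use a function that pulls one value out of that vector. Because the vector consists of identical values within each group, many functions will work, for example min or first: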

TimeTempReprod %>% 
group_by(Date, Station) %>%
mutate(y = between(Temperature, min(Tmin), min(Tmin) + 2))

Output:

# A tibble: 96 × 8
# Groups:   Date, Station [2]
Station Date       Time     Temperature  Tmin  Tmed  Tmax y    
<chr>   <date>     <time>         <dbl> <dbl> <dbl> <dbl> <lgl>
1 F       2021-10-15 00:11:46        16.8  15.2  17.1  20.4 TRUE 
2 F       2021-10-15 00:41:46        16.5  15.2  17.1  20.4 TRUE 
3 F       2021-10-15 01:11:46        16.2  15.2  17.1  20.4 TRUE 
4 F       2021-10-15 01:41:46        15.6  15.2  17.1  20.4 TRUE 
5 F       2021-10-15 02:11:46        15.9  15.2  17.1  20.4 TRUE 
6 F       2021-10-15 02:41:46        16.1  15.2  17.1  20.4 TRUE 
7 F       2021-10-15 03:11:46        16.4  15.2  17.1  20.4 TRUE 
8 F       2021-10-15 03:41:46        16.2  15.2  17.1  20.4 TRUE 
9 F       2021-10-15 04:11:46        16    15.2  17.1  20.4 TRUE 
10 F       2021-10-15 04:41:46        16    15.2  17.1  20.4 TRUE 
# … with 86 more rows
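
As a rough sketch of the same idea on a small made-up table (the toy data frame and the column names near_min_first and near_min_cmp are assumptions for illustration, not the poster's data): first() reduces the per-group Tmin vector to one value, and a plain element-wise comparison sidesteps the scalar-bound requirement entirely.

```r
library(dplyr)

# Hypothetical toy data: two days, with Tmin repeated within each day
toy <- tibble(
  Date        = as.Date(c("2021-10-15", "2021-10-15", "2021-10-16", "2021-10-16")),
  Temperature = c(16.8, 18.6, 17.6, 22.2),
  Tmin        = c(15.2, 15.2, 16.2, 16.2)
)

res <- toy %>%
  group_by(Date) %>%
  mutate(
    # first() picks a single value from the per-group Tmin vector
    near_min_first = between(Temperature, first(Tmin), first(Tmin) + 2),
    # element-wise comparison; the bounds may be full-length vectors
    near_min_cmp = Temperature >= Tmin & Temperature <= Tmin + 2
  ) %>%
  ungroup()

print(res)
```

Both new columns should agree row for row. The comparison form also keeps working if Tmin ever varies within a group, whereas min() or first() would silently pick one value.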

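I ran into a similar problem, but I did not want to group my data. I wanted to check whether the value in column A lies between the values in columns B and C.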

df %>%
mutate(is_between = between(A, B, C))

However, this produced a similar error, which led me to this thread.

The solution is to perform the calculation row by row:

df %>%
rowwise() %>%
mutate(is_between = between(A, B, C)) %>%
ungroup() # Removes row-wise calculation mode

This produced the expected result.
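
Since rowwise() evaluates one row at a time, it can be slow on large tables. A possible alternative, sketched here on a hypothetical df (the values below are made up; only the column names A, B and C come from the snippet above), is to write the inclusive range test directly with comparison operators, which already work element-wise:

```r
library(dplyr)

# Hypothetical example data using the column names A, B, C from the snippet above
df <- tibble(
  A = c(5, 1, 7),
  B = c(1, 2, 7),
  C = c(10, 3, 9)
)

out <- df %>%
  # same inclusive test that between() performs: B <= A and A <= C, per row
  mutate(is_between = A >= B & A <= C)

print(out)
```

No rowwise() or ungroup() is needed here, because nothing is grouped in the first place.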
