r - Check whether a series of dates falls within a series of different intervals



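This seems like a simple thing, but I'm stumped.

I'm using the tidyverse material as a guide: here

I have a list of recession periods, and I want to create a data frame as output that lists every date and whether that date falls in a recession. I'd like to keep the solution in dplyr style.

Here is a reproducible example: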

library(lubridate)
library(tidyverse)
# Sample data set
my_df <-
structure(list(recession_start = structure(c(1400, 3652, 4199, 
7486, 11382, 13848), class = "Date"), recession_end = structure(c(1885, 
3834, 4687, 7729, 11627, 14396), class = "Date"), recession_interval = new("Interval", 
.Data = c(41904000, 15724800, 42163200, 20995200, 21168000, 
47347200), start = structure(c(120960000, 315532800, 362793600, 
646790400, 983404800, 1196467200), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), tzone = "UTC")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

> my_df
# A tibble: 6 x 3
recession_start recession_end recession_interval            
<date>          <date>        <Interval>                    
1 1973-11-01      1975-03-01    1973-11-01 UTC--1975-03-01 UTC
2 1980-01-01      1980-07-01    1980-01-01 UTC--1980-07-01 UTC
3 1981-07-01      1982-11-01    1981-07-01 UTC--1982-11-01 UTC
4 1990-07-01      1991-03-01    1990-07-01 UTC--1991-03-01 UTC
5 2001-03-01      2001-11-01    2001-03-01 UTC--2001-11-01 UTC
6 2007-12-01      2009-06-01    2007-12-01 UTC--2009-06-01 UTC

# Get every day in the range of dates
my_dates <- seq(first(my_df$recession_start), today(), by = "day")

# Create a list of intervals
recession_intervals <- list(my_df$recession_interval)

# Check to see if `my_dates` are in the intervals
recession <- my_dates %within% recession_intervals  # Throws warning and does not give expected results

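I suspect this is because my list of intervals is a single list rather than multiple lists as in the tidyverse example, but I don't know how to create multiple lists manually.

The desired output is a data frame with every date and a TRUE/FALSE column indicating whether that daily date falls within a recession interval. Something like: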

recession_df <- data.frame(Date = my_dates, recession = recession) 

The output would look like this:

Date recession
1  1973-11-01      TRUE
2  1973-11-02      TRUE
3  1973-11-03      TRUE
4  1973-11-04      TRUE
5  1973-11-05      TRUE
6  1973-11-06      TRUE
7  1973-11-07      TRUE
8  1973-11-08      TRUE
9  1973-11-09      TRUE
10 1973-11-10      TRUE

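Thanks for your help!

One option is to loop (`map`) over `my_dates`, check whether `any` of the intervals in the `recession_interval` column contains the date (`%within%`), create a tibble from each date and its logical result, and row-bind everything into a single dataset (the `_dfr` suffix).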

library(purrr)
out <- map_dfr(my_dates, ~ tibble(Date = .x,
                                  recession = any(Date %within% my_df$recession_interval)))

-output

# A tibble: 17,381 x 2
Date       recession
<date>     <lgl>    
1 1973-11-01 TRUE     
2 1973-11-02 TRUE     
3 1973-11-03 TRUE     
4 1973-11-04 TRUE     
5 1973-11-05 TRUE     
6 1973-11-06 TRUE     
7 1973-11-07 TRUE     
8 1973-11-08 TRUE     
9 1973-11-09 TRUE     
10 1973-11-10 TRUE     
# … with 17,371 more rows
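
As for why the original attempt warns: as far as I can tell, `%within%` accepts a list on the right-hand side and tests each date against every element of that list, but `list(my_df$recession_interval)` builds a list with a single element holding all six intervals, so the six intervals get recycled against the dates. A minimal sketch of a loop-free fix, assuming a lubridate version recent enough to support a list of intervals; `interval_list` is a name introduced here, the dates are copied from the printed tibble, and the end of the date range is fixed (instead of `today()`) only to make the example reproducible:

```r
library(lubridate)

# Recession boundaries, copied from the printed tibble in the question
recession_start <- as.Date(c("1973-11-01", "1980-01-01", "1981-07-01",
                             "1990-07-01", "2001-03-01", "2007-12-01"))
recession_end   <- as.Date(c("1975-03-01", "1980-07-01", "1982-11-01",
                             "1991-03-01", "2001-11-01", "2009-06-01"))
recession_interval <- interval(recession_start, recession_end)

# Fixed end date (an assumption for reproducibility; the question uses today())
my_dates <- seq(recession_start[1], as.Date("2010-12-31"), by = "day")

# One list element per interval, instead of one element holding all six
interval_list <- lapply(seq_along(recession_interval),
                        function(i) recession_interval[i])

# TRUE where a date falls inside at least one of the intervals
recession <- my_dates %within% interval_list

recession_df <- data.frame(Date = my_dates, recession = recession)
head(recession_df)
```

In a dplyr pipeline the same expression could go inside `mutate()`, e.g. `mutate(recession = Date %within% interval_list)`.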

This worked for me:

in_recession <- 
  tibble(date = my_dates) %>% 
  mutate(
    recession = date %>% 
      map_lgl(~any(.x %within% my_df$recession_interval))
  )

We can also use the following solution, which does not use the recession_interval column:

library(purrr)
my_dates %>%
  as_tibble() %>%
  rowwise() %>%
  mutate(fall = any(map2_lgl(my_df$recession_start, my_df$recession_end, 
                             ~ between(value, .x, .y))))
# A tibble: 17,382 x 2
# Rowwise: 
value      fall 
<date>     <lgl>
1 1973-11-01 TRUE 
2 1973-11-02 TRUE 
3 1973-11-03 TRUE 
4 1973-11-04 TRUE 
5 1973-11-05 TRUE 
6 1973-11-06 TRUE 
7 1973-11-07 TRUE 
8 1973-11-08 TRUE 
9 1973-11-09 TRUE 
10 1973-11-10 TRUE 
# ... with 17,372 more rows
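
The same start/end comparison can also be written without a row-wise loop. The following is only a sketch in base R, not something from the answers above: it compares every date against every recession at once with `outer()`. The names `after_start`, `before_end` and `in_recession` are introduced here for illustration, the dates are copied from the printed tibble, and the end of the range is fixed rather than `today()` so the example is reproducible.

```r
# Recession boundaries, copied from the printed tibble in the question
recession_start <- as.Date(c("1973-11-01", "1980-01-01", "1981-07-01",
                             "1990-07-01", "2001-03-01", "2007-12-01"))
recession_end   <- as.Date(c("1975-03-01", "1980-07-01", "1982-11-01",
                             "1991-03-01", "2001-11-01", "2009-06-01"))

# Fixed end date (an assumption for reproducibility; the question uses today())
my_dates <- seq(recession_start[1], as.Date("2010-12-31"), by = "day")

# Logical matrices: one row per date, one column per recession
after_start <- outer(as.numeric(my_dates), as.numeric(recession_start), ">=")
before_end  <- outer(as.numeric(my_dates), as.numeric(recession_end), "<=")

# A date is in a recession if it lies between start and end (inclusive)
# for at least one column
in_recession <- rowSums(after_start & before_end) > 0

result <- data.frame(Date = my_dates, recession = in_recession)
head(result)
```

With six recessions the intermediate matrices stay small; for a very large number of intervals a join-based approach may be preferable.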
