I want to complete a data frame in R when it is missing dates for some months. For example, suppose I have one year of data like this:
df = data.frame(Date = c("2020-01-01","2020-02-01",
"2020-03-02","2020-04-01","2020-09-01","2020-10-01",
"2020-11-01","2020-12-01"))
When I use the complete function, I call it like this:
df = df %>%
  mutate(Date = as.Date(Date)) %>%
  complete(Date = seq.Date(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "month"))
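The problem is that my final df has May, June, July, etc. filled in, which is fine, but it also gets an extra row for March: March has no entry for the first of the month (its only date is 2020-03-02), so 2020-03-01 is added as well.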
df = data.frame(Date = c("2020-01-01","2020-02-01",
"2020-03-01","2020-03-02","2020-04-01","2020-05-01",
"2020-06-01","2020-07-01","2020-08-01","2020-09-01",
"2020-10-01","2020-11-01","2020-12-01"))
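Do you know how to complete the df only for months that have no date at all?
In my case, I do not want March to be completed, because March already has a date.
Thank you very much.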
You can extract the year and month values from Date and use complete on those instead.
library(dplyr)
library(lubridate)
library(tidyr)
df %>%
  mutate(Date = as.Date(Date),
         year = year(Date),
         month = month(Date)) %>%
  complete(year, month = 1:12) %>%
  mutate(Date = if_else(is.na(Date),
                        as.Date(paste(year, month, 1, sep = '-')), Date)) %>%
  select(Date)
# Date
# <date>
# 1 2020-01-01
# 2 2020-02-01
# 3 2020-03-02
# 4 2020-04-01
# 5 2020-05-01
# 6 2020-06-01
# 7 2020-07-01
# 8 2020-08-01
# 9 2020-09-01
#10 2020-10-01
#11 2020-11-01
#12 2020-12-01
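The idea behind this answer, matching on year and month rather than on the exact date, can also be sketched in base R without extra packages. This is only an illustrative sketch: the names grid, missing_by_date, missing_by_month and result are made up for the example, and it assumes the data cover the single year 2020 as in the question.

```r
# Input dates from the question, converted to Date
df <- data.frame(Date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-02",
                                  "2020-04-01", "2020-09-01", "2020-10-01",
                                  "2020-11-01", "2020-12-01")))

# One candidate date per month of 2020 (the first of each month)
grid <- seq(as.Date("2020-01-01"), as.Date("2020-12-01"), by = "month")

# Exact-date matching treats 2020-03-01 as missing: this is the original problem
missing_by_date <- grid[!grid %in% df$Date]

# Matching on "YYYY-MM" keys skips every month that already has any date
missing_by_month <- grid[!format(grid, "%Y-%m") %in% format(df$Date, "%Y-%m")]

# Append only the truly missing months and restore chronological order
result <- data.frame(Date = sort(c(df$Date, missing_by_month)))
```

Here missing_by_date contains March 1 in addition to May through August, while missing_by_month contains only May through August, so result keeps 2020-03-02 as the single March row.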
One possible solution is to complete only by yearmon from the zoo package, so the result does not depend on the actual day of the month.
library(dplyr)
library(zoo) # for as.yearmon
library(tidyr) # for complete
df <- data.frame(Date = c("2020-01-01","2020-02-01",
"2020-03-02","2020-04-01",
"2020-09-01","2020-10-01",
"2020-11-01","2020-12-01"),
id = 1:8)
df
#> Date id
#> 1 2020-01-01 1
#> 2 2020-02-01 2
#> 3 2020-03-02 3
#> 4 2020-04-01 4
#> 5 2020-09-01 5
#> 6 2020-10-01 6
#> 7 2020-11-01 7
#> 8 2020-12-01 8
df %>%
  mutate(Date = as.Date(Date),
         year_mon = as.yearmon(Date)) %>%
  complete(
    year_mon = seq.Date(as.Date("2020-01-01"),
                        as.Date("2020-12-31"),
                        by = "month") %>% as.yearmon()
  )
#> # A tibble: 12 x 3
#> year_mon Date id
#> <yearmon> <date> <int>
#> 1 Jan 2020 2020-01-01 1
#> 2 Feb 2020 2020-02-01 2
#> 3 Mar 2020 2020-03-02 3
#> 4 Apr 2020 2020-04-01 4
#> 5 May 2020 NA NA
#> 6 Jun 2020 NA NA
#> 7 Jul 2020 NA NA
#> 8 Aug 2020 NA NA
#> 9 Sep 2020 2020-09-01 5
#> 10 Oct 2020 2020-10-01 6
#> 11 Nov 2020 2020-11-01 7
#> 12 Dec 2020 2020-12-01 8
Created on 2021-06-25 by the reprex package (v2.0.0)
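The yearmon approach leaves Date as NA for the added months. If a concrete date is wanted there, one option is to fall back to the first day of the month. The following is a minimal sketch of that extra step, not part of the answer above; it assumes as.Date() on a yearmon value returns the first of the month (zoo's default) and uses the made-up names months_2020 and filled.

```r
library(dplyr)
library(tidyr)
library(zoo)

df <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-02", "2020-04-01",
                   "2020-09-01", "2020-10-01", "2020-11-01", "2020-12-01")),
  id = 1:8
)

# All twelve months of 2020 as yearmon values
months_2020 <- as.yearmon(seq(as.Date("2020-01-01"), as.Date("2020-12-01"),
                              by = "month"))

filled <- df %>%
  mutate(year_mon = as.yearmon(Date)) %>%
  complete(year_mon = months_2020) %>%
  # Keep existing dates; use the first of the month only where Date is NA
  mutate(Date = coalesce(Date, as.Date(year_mon))) %>%
  arrange(Date) %>%
  select(Date, id)
```

The id column stays NA for the added months, which makes it easy to tell filled rows from original ones.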