How do I get the middle date out of three dates in R?

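I have a data table with three date columns, x, y and z, and I am trying to create a new column (new_col) that holds, for each row, the middle of the three dates when ordered from earliest to latest. In other words, I want the date that lies between the minimum and the maximum. See the table below: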

x              y              z              new_col
1st Jan 2005   4th May 1998   2nd Mar 2009   1st Jan 2005
9th May 2010   14th Feb 2003  9th Jan 2008   9th Jan 2008
7th Sept 2002  8th Dec 2010   23rd May 2012  8th Dec 2010

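The approach below

coerces the character date strings to class Date, because character dates cannot be ordered chronologically or used in arithmetic,
finds the position of the "middle" date in each row,
returns the corresponding string,
which ends up in new_col.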

This can be done by using apply() to call a suitable function on each row:

df$new_col <- apply(df, 1L, function(x) x[order(lubridate::dmy(x))][2L])
df

x             y             z      new_col
1  1st Jan 2005  4th May 1998  2nd Mar 2009 1st Jan 2005
2  9th May 2010 14th Feb 2003  9th Jan 2008 9th Jan 2008
3 7th Sept 2002  8th Dec 2010 23rd May 2012 8th Dec 2010

Note

This returns the expected result. new_col is a character date string.

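However, if the OP intends to keep working with the Date type, for example to do more date arithmetic, I would suggest following Ben's example: coerce the whole data.frame to Date and stick with it.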
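To make the per-row logic of the apply() call easier to follow, here is a minimal sketch that traces it on the first row only. The ISO dates passed to as.Date() are a hand-written stand-in for what lubridate::dmy() would return, so that the snippet runs in base R; the variable names are only illustrative.

```r
# First row of the example data, as character strings
row <- c(x = "1st Jan 2005", y = "4th May 1998", z = "2nd Mar 2009")

# Stand-in for lubridate::dmy(row): the same three dates as Date objects
# (assumption: typed by hand here to avoid the lubridate dependency)
parsed <- as.Date(c("2005-01-01", "1998-05-04", "2009-03-02"))

# order() gives the positions of the dates from earliest to latest
ord <- order(parsed)

# Reorder the original strings and pick the second one, i.e. the middle date
middle <- row[ord][2L]
middle
```

Because the original strings are indexed rather than the parsed dates, the result keeps the input text format.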

First make sure all of your dates are of class "Date". You can use dmy from lubridate for this (assuming your data frame is called df):

library(lubridate)
df[] <- lapply(df, dmy)

Next, sort each row chronologically and take the middle column (column 2) as new_col:

df$new_col <- as.Date(t(apply(df, 1, sort))[,2])
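As a possible alternative to sorting row by row, the middle of exactly three values can also be computed column-wise with pmin() and pmax(). This is a sketch rather than part of the answer above; it assumes the three columns are already of class Date, and it builds the example data from ISO strings so that it runs without lubridate.

```r
# Example data already stored as Date (assumption: parsed beforehand)
df <- data.frame(
  x = as.Date(c("2005-01-01", "2010-05-09", "2002-09-07")),
  y = as.Date(c("1998-05-04", "2003-02-14", "2010-12-08")),
  z = as.Date(c("2009-03-02", "2008-01-09", "2012-05-23"))
)

# Median of three: clamp z between the smaller and the larger of x and y
lo <- pmin(df$x, df$y)
hi <- pmax(df$x, df$y)
df$new_col <- pmax(lo, pmin(hi, df$z))
df
```

This only works for three columns; for an arbitrary number of columns the row-wise sort is the more general choice.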

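Finally, if you want the result displayed in the same text format as the input (e.g. "1st Jan 2005" rather than "2005-01-01"), you can use a custom function like the following, which is based on another answer: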

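library(dplyr)

# Format Dates as e.g. "1st Jan 2005"; day() comes from lubridate (loaded above)
date_to_text <- function(dates) {
  dayy <- day(dates)
  # 11, 12 and 13 take "th"; otherwise the last digit decides the suffix
  suff <- case_when(dayy %in% c(11, 12, 13) ~ "th",
                    dayy %% 10 == 1 ~ "st",
                    dayy %% 10 == 2 ~ "nd",
                    dayy %% 10 == 3 ~ "rd",
                    TRUE ~ "th")
  # "%b" gives the abbreviated month name in the current locale
  paste0(dayy, suff, " ", format(dates, "%b %Y"))
}

# Convert every column, including new_col, back to text
df[] <- lapply(df, date_to_text)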

x             y             z      new_col
1 1st Jan 2005  4th May 1998  2nd Mar 2009 1st Jan 2005
2 9th May 2010 14th Feb 2003  9th Jan 2008 9th Jan 2008
3 7th Sep 2002  8th Dec 2010 23rd May 2012 8th Dec 2010
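If you would rather not load dplyr just for case_when, the suffix rule inside date_to_text can be sketched in base R with nested ifelse(). The helper name ordinal_suffix is made up for this illustration.

```r
# Hypothetical base-R helper: ordinal suffix for day-of-month numbers
ordinal_suffix <- function(d) {
  ifelse(d %in% 11:13, "th",             # 11th, 12th, 13th are special cases
    ifelse(d %% 10 == 1, "st",
      ifelse(d %% 10 == 2, "nd",
        ifelse(d %% 10 == 3, "rd", "th"))))
}

days <- c(1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 31)
labels <- paste0(days, ordinal_suffix(days))
labels
```

The check for 11 to 13 has to come first, otherwise those days would wrongly get "st", "nd" and "rd" from the last-digit rules.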

Data

df <- structure(list(
  x = c("1st Jan 2005", "9th May 2010", "7th Sept 2002"),
  y = c("4th May 1998", "14th Feb 2003", "8th Dec 2010"),
  z = c("2nd Mar 2009", "9th Jan 2008", "23rd May 2012")
), class = "data.frame", row.names = c(NA, -3L))
