Combine two columns into one (a Date column and a Time column) and change their format



Structure of the data:

str(noaaData)
'data.frame':   92 obs. of  3 variables:
$ Date         : int  20121029 20121029 20121029 20121029 20121029 20121029 20121029 20121029 20121029 20121029 ...
$ Time         : int  100 200 300 400 500 600 700 800 900 1000 ...
$ AtmosPressure: num  999 999 998 998 997 ...

I need to combine the Date and Time columns into a single column named "DateTime" and convert it to POSIXct, displayed in the format "%Y-%m-%d %H:%M:%OS".

It should look like this:

2012-10-29 01:00:00

Minimal reproducible example:

dput(noaaData)
structure(list(Date = c(20121029L, 20121029L, 20121029L, 20121029L, 
20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 
20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 
20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 
20121029L, 20121030L, 20121030L, 20121030L, 20121030L, 20121030L, 
20121030L, 20121030L, 20121030L, 20121030L, 20121030L, 20121030L, 
20121030L, 20121030L, 20121030L, 20121030L, 20121030L, 20121030L, 
20121030L, 20121030L, 20121030L, 20121030L, 20121031L, 20121031L, 
20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 
20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 
20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 20121031L, 
20121031L, 20121031L, 20121031L, 20121031L, 20121101L, 20121101L, 
20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 
20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 
20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 20121101L, 
20121101L, 20121101L, 20121101L, 20121101L), Time = structure(c(8640000, 
17280000, 25920000, 34560000, 43200000, 51840000, 60480000, 69120000, 
77760000, 86400000, 95040000, 103680000, 112320000, 120960000, 
129600000, 138240000, 146880000, 155520000, 164160000, 172800000, 
181440000, 190080000, 198720000, 25920000, 34560000, 43200000, 
51840000, 60480000, 69120000, 77760000, 86400000, 95040000, 103680000, 
112320000, 120960000, 129600000, 138240000, 146880000, 155520000, 
164160000, 172800000, 181440000, 190080000, 198720000, 0, 8640000, 
17280000, 25920000, 34560000, 43200000, 51840000, 60480000, 69120000, 
77760000, 86400000, 95040000, 103680000, 112320000, 120960000, 
129600000, 138240000, 146880000, 155520000, 164160000, 172800000, 
181440000, 190080000, 198720000, 0, 8640000, 17280000, 25920000, 
34560000, 43200000, 51840000, 60480000, 69120000, 77760000, 86400000, 
95040000, 103680000, 112320000, 120960000, 129600000, 138240000, 
146880000, 155520000, 164160000, 172800000, 181440000, 190080000, 
198720000), class = c("POSIXct", "POSIXt")), AtmosPressure = c(999.2, 
999.2, 998.3, 997.9, 996.7, 995.4, 994.4, 992, 990.2, 988.4, 
987.1, 985.9, 984, 981.5, 978.8, 975.3, 971.4, 966.3, 961.9, 
956.1, 951.4, 949.7, 946.3, 968, 972.9, 976.5, 979.4, 981.2, 
982.5, 983.7, 985, 986.2, 987.5, 988.9, 990.1, 991, 992, 992.6, 
993, 993.4, 994.1, 994.6, 995.4, 996, 996.6, 997, 997.1, 997.3, 
997.3, 997.4, 997.2, 997.2, 997.5, 997.8, 998.3, 998.6, 999, 
999.8, 1000, 1000.1, 1000.2, 999.7, 999.5, 999.1, 999.5, 1000, 
1000.4, 1001.1, 1001.3, 1001.6, 1001.7, 1001.8, 1001.7, 1001.7, 
1001.7, 1002, 1001.9, 1001.9, 1002.1, 1002, 1002.5, 1002.9, 1003.1, 
1003, 1003.1, 1002.5, 1002.1, 1001.8, 1001.9, 1001.9, 1002.2, 
1002.9)), row.names = c(NA, -92L), class = "data.frame")

I would appreciate any help with this.

There may be a cleaner solution, but this should work and is flexible.

1. Minimal reproducible example:

# Integer literals (L suffix) so the columns are int, as in the question's data
df <- data.frame(Date = c(20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 20121029L, 20121029L),
                 Time = c(400L, 500L, 600L, 700L, 800L, 900L, 1000L),
                 stringsAsFactors = FALSE)
str(df)
'data.frame':   7 obs. of  2 variables:
$ Date         : int  20121029 20121029 20121029 20121029 20121029 20121029 20121029
$ Time         : int  400 500 600 700 800 900 1000

2. Solution with a helper function and strptime:

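# Left-pad each value with zeros to a width of 4 characters (e.g. 400 -> "0400")
leftPad <- function(x) {
  len <- 4
  # Prepend "0000", then keep only the last `len` characters
  s <- sapply(sapply(x, function(y) paste0("0000", y)),
              function(z) substr(z, nchar(z) - len + 1, nchar(z)))
  return(as.character(s))
}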

leftPad can be used to format Time for strptime (the time zone can be changed with the tz argument):

strptime(paste0(df$Date, " ", leftPad(df$Time)), format = "%Y%m%d %H%M", tz="UTC")

Returns:

[1] "2012-10-29 04:00:00 UTC" "2012-10-29 05:00:00 UTC"
[3] "2012-10-29 06:00:00 UTC" "2012-10-29 07:00:00 UTC"
[5] "2012-10-29 08:00:00 UTC" "2012-10-29 09:00:00 UTC"
[7] "2012-10-29 10:00:00 UTC"
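
Note that strptime returns a POSIXlt object, whereas the question asks for a POSIXct column named DateTime. The following is a minimal sketch of that last step, not a definitive implementation: it repeats the toy data and the leftPad helper so that it runs on its own, and it assumes the timestamps are in UTC, which the question does not state.

```r
# Same kind of toy data as in step 1 (shortened)
df <- data.frame(Date = c(20121029L, 20121029L, 20121029L),
                 Time = c(400L, 900L, 1000L))

# Helper from step 2, repeated so this snippet is self-contained
leftPad <- function(x) {
  len <- 4
  s <- sapply(sapply(x, function(y) paste0("0000", y)),
              function(z) substr(z, nchar(z) - len + 1, nchar(z)))
  return(as.character(s))
}

# as.POSIXct() accepts the same format and tz arguments as strptime()
# tz = "UTC" is an assumption; replace it with the zone of the observations
df$DateTime <- as.POSIXct(paste0(df$Date, " ", leftPad(df$Time)),
                          format = "%Y%m%d %H%M", tz = "UTC")

class(df$DateTime)                        # "POSIXct" "POSIXt"
format(df$DateTime, "%Y-%m-%d %H:%M:%S")  # display format requested in the question
```

A POSIXct value has no display format of its own; the "%Y-%m-%d %H:%M:%S" layout is applied only when the column is printed or passed to format().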

Here is one way:

noaaData <- data.frame(
  Date = c(20121029, 20121029, 20121029, 20121029, 20121029, 20121029, 20121029),
  Time = c(400, 500, 600, 700, 800, 900, 1000),
  stringsAsFactors = FALSE
)
# formatC zero-pads Time to four digits (e.g. 400 -> "0400") before parsing
strptime(
  paste(noaaData$Date, formatC(noaaData$Time, width = 4, format = "d", flag = "0")),
  format = "%Y%m%d %H%M"
)
[1] "2012-10-29 04:00:00 CET" "2012-10-29 05:00:00 CET" "2012-10-29 06:00:00 CET" "2012-10-29 07:00:00 CET" "2012-10-29 08:00:00 CET"
[6] "2012-10-29 09:00:00 CET" "2012-10-29 10:00:00 CET"

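You may want to change the time zone with the tz argument of strptime; when tz is omitted, the local time zone of the machine running the code is used, which is why the output above shows CET.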
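
To illustrate why the tz argument matters, here is a small sketch that parses the same Date and Time values under two time zones. The zone names "UTC" and "America/New_York" are assumptions chosen for illustration only; the question does not say which zone the NOAA readings use.

```r
noaaData <- data.frame(Date = c(20121029, 20121029),
                       Time = c(400, 1000))

# Build "YYYYMMDD HHMM" strings, zero-padding Time with formatC as above
stamp <- paste(noaaData$Date,
               formatC(noaaData$Time, width = 4, format = "d", flag = "0"))

# Parse the same wall-clock strings under two (assumed) time zones
utc <- as.POSIXct(stamp, format = "%Y%m%d %H%M", tz = "UTC")
est <- as.POSIXct(stamp, format = "%Y%m%d %H%M", tz = "America/New_York")

# The printed clock times look identical, but they are different instants
difftime(est, utc, units = "hours")

# Store the chosen version as the DateTime column
noaaData$DateTime <- utc
```

Setting tz explicitly keeps the result independent of the machine that runs the script.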
