How to parse only part of a string as a date in R



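I have some data from an API that provides timestamps in an unusual format, with the day of the week and the day of the year appended at the end. For example, [2021,8,22,22,0,20,6,234] is day 6 of the week and day 234 of the year, 2021/08/22 22:00:20. I want to convert this to a lubridate date-time object, but I can't figure out how to drop the last two values.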

For example, I'd like to take this data

example <- tibble(timestamp = c("[2021, 8, 22, 22, 0, 20, 6, 234]", "[2021, 8, 22, 22, 0, 30, 6, 234]", "[2021, 8, 22, 22, 0, 41, 6, 234]"), temperature = c(28, 29, 30))

and convert the timestamp column to a lubridate date-time type. Any ideas?

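You can use strptime and supply a format string that covers only the first six fields. strptime reads each input only as far as the format requires and ignores any trailing characters, so the last two values are dropped automatically.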

example %>% dplyr::mutate(
  datetime = strptime(timestamp, format = "[%Y, %m, %d, %H, %M, %S"))
# A tibble: 3 x 3
timestamp                        temperature datetime           
<chr>                                  <dbl> <dttm>             
1 [2021, 8, 22, 22, 0, 20, 6, 234]          28 2021-08-22 22:00:20
2 [2021, 8, 22, 22, 0, 30, 6, 234]          29 2021-08-22 22:00:30
3 [2021, 8, 22, 22, 0, 41, 6, 234]          30 2021-08-22 22:00:41
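The same idea can be sketched in base R alone, which makes it easier to see that the trailing ", 6, 234]" is simply never read. This is a minimal sketch, not the answerer's exact code: the variable names ts and parsed are made up for illustration, and tz = "UTC" is an assumption because the question does not say which time zone the API uses. Wrapping the result in as.POSIXct gives the same class that lubridate works with.

```r
# Sample timestamps from the question (ts is a hypothetical name).
ts <- c("[2021, 8, 22, 22, 0, 20, 6, 234]",
        "[2021, 8, 22, 22, 0, 30, 6, 234]")

# strptime() stops reading once the format is consumed, so ", 6, 234]" is ignored.
# tz = "UTC" is an assumption; replace it with the time zone the API actually uses.
parsed <- as.POSIXct(strptime(ts, format = "[%Y, %m, %d, %H, %M, %S", tz = "UTC"))

format(parsed, "%Y-%m-%d %H:%M:%S")
#> [1] "2021-08-22 22:00:20" "2021-08-22 22:00:30"
```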

How about this

library(tidyverse)
example <- tibble(timestamp = c("[2021, 8, 22, 22, 0, 20, 6, 234]", "[2021, 8, 22, 22, 0, 30, 6, 234]", "[2021, 8, 22, 22, 0, 41, 6, 234]"), temperature = c(28,29,30))
example %>%
  mutate(timestamp = str_split(timestamp, ","),
         timestamp = map_chr(timestamp, ~paste(parse_number(.x[1:6]), collapse = ".")),
         timestamp = lubridate::ymd_hms(timestamp))
#> # A tibble: 3 x 2
#>   timestamp           temperature
#>   <dttm>                    <dbl>
#> 1 2021-08-22 22:00:20          28
#> 2 2021-08-22 22:00:30          29
#> 3 2021-08-22 22:00:41          30

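I simply split each string on the commas, parse the numbers to strip the brackets, keep only the first six elements (which drops the last two) and collapse them back into one string, and finally parse that as a date-time.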
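A variation on the same split-and-keep-six idea is to skip the intermediate string and pass the numeric fields straight to lubridate::make_datetime. This is only a sketch under assumptions: the names ts, parts, m and parsed are made up for illustration, and tz = "UTC" (which is also make_datetime's default) is assumed because the question does not state a time zone.

```r
library(lubridate)

# Sample timestamps from the question (ts is a hypothetical name).
ts <- c("[2021, 8, 22, 22, 0, 20, 6, 234]",
        "[2021, 8, 22, 22, 0, 41, 6, 234]")

# Pull every run of digits out of each string; the result is a list of character vectors.
parts <- regmatches(ts, gregexpr("[0-9]+", ts))

# Keep the first six fields of each timestamp; after t() there is one row per timestamp.
m <- t(vapply(parts, function(p) as.numeric(p[1:6]), numeric(6)))

# Build the date-times directly from year, month, day, hour, minute, second.
# tz = "UTC" is an assumption about the API.
parsed <- make_datetime(m[, 1], m[, 2], m[, 3], m[, 4], m[, 5], m[, 6], tz = "UTC")

format(parsed, "%Y-%m-%d %H:%M:%S")
#> [1] "2021-08-22 22:00:20" "2021-08-22 22:00:41"
```

Building from numeric fields avoids relying on how a separator such as "." is interpreted when the collapsed string is parsed.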
