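I have a visit_date_time column that contains datetimes both as strings and as Unix timestamps, and I want to convert it to a common datetime format for further feature engineering. How can I do this in Python?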
6587980 2018-05-20 14:19:55.951
6587981 2018-05-24 15:53:26.731
6587982 2018-05-27 07:55:17.768
6587983 2018-05-25 11:28:56.526
6587984 2018-05-10 12:23:21.786
6587985 2018-05-07 10:08:08.978
6587986 2018-05-11 20:50:38.239
6587987 2018-05-21 10:21:37.663
6587988 1526139196864000000
6587989 2018-05-07 14:49:23.292
6587990 2018-05-14 21:18:43.132
6587991 2018-05-07 10:36:55.887
6587992 2018-05-09 05:42:04.907
6587993 2018-05-22 09:05:42.329
6587994 NaN
6587995 2018-05-21 07:14:03.231
6587996 2018-05-25 09:13:011
6587997 NaN
6587998 2018-05-20 12:09:35.347
6587999 2018-05-17 03:30:22.330
I have marked the three different types that appear in this one column in bold.
Please consider using dput to provide a sample of your data.
You can use the lubridate and dplyr packages to give all the dates a common format:
library(dplyr)
library(lubridate)
data <- data %>%
mutate(visit_date_time = as_datetime(visit_date_time))
The as_datetime function automatically converts Unix timestamps to a readable format.
Edit: this is an R answer.
Given the example
df
v t
0 6587987 2018-05-21 10:21:37.663
1 6587988 1526139196864000000
2 6587989 2018-05-07 14:49:23.292
3 6587990 2018-05-14 21:18:43.132
4 6587991 2018-05-07 10:36:55.887
5 6587992 2018-05-09 05:42:04.907
6 6587993 2018-05-22 09:05:42.329
7 6587994 NaN
8 6587995 2018-05-21 07:14:03.231
we can use a two-step approach: first parse the datetime strings, then parse the numeric values:
import pandas as pd

# 1) parse strings to datetime
df['datetime'] = pd.to_datetime(df['t'], errors='coerce')
# 2) where we have numeric values
m = pd.to_numeric(df['t'], errors='coerce').notnull()
# parse these to datetime as well; Unix time nanoseconds is default (otherwise, set unit kwarg)
df.loc[m, 'datetime'] = pd.to_datetime(df['t'][m].astype('int64'))
which gives
df
v t datetime
0 6587987 2018-05-21 10:21:37.663 2018-05-21 10:21:37.663
1 6587988 1526139196864000000 2018-05-12 15:33:16.864
2 6587989 2018-05-07 14:49:23.292 2018-05-07 14:49:23.292
3 6587990 2018-05-14 21:18:43.132 2018-05-14 21:18:43.132
4 6587991 2018-05-07 10:36:55.887 2018-05-07 10:36:55.887
5 6587992 2018-05-09 05:42:04.907 2018-05-09 05:42:04.907
6 6587993 2018-05-22 09:05:42.329 2018-05-22 09:05:42.329
7 6587994 NaN NaT
8 6587995 2018-05-21 07:14:03.231 2018-05-21 07:14:03.231
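The code comment above mentions the unit keyword for timestamps that are not in nanoseconds. As a rough sketch of what that looks like, here is the same instant written in four Unix units; the samples dictionary is made up for illustration and is not part of the question's data:

```python
import pandas as pd

# Hypothetical sample: one instant expressed in seconds, milliseconds,
# microseconds and nanoseconds since the Unix epoch.
samples = {
    "s": 1526139196,
    "ms": 1526139196864,
    "us": 1526139196864000,
    "ns": 1526139196864000000,
}

# pd.to_datetime interprets plain integers as nanoseconds unless `unit` says otherwise.
parsed = {unit: pd.to_datetime(value, unit=unit) for unit, value in samples.items()}

for unit, ts in parsed.items():
    print(unit, ts)
```

If your column holds, say, millisecond timestamps, passing unit='ms' in step 2 should be the only change needed.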
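import datetime

s = 1526139196864000000  # Unix time in nanoseconds
fmt = "%Y-%m-%d %H:%M:%S.%f"
s = float(s) / 1000000  # nanoseconds -> milliseconds
# milliseconds -> seconds, interpreted as UTC
t_utc = datetime.datetime.utcfromtimestamp(s / 1000)
print(t_utc.strftime(fmt))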
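A possible variation on the snippet above, in case float rounding or the deprecation of utcfromtimestamp (since Python 3.12) is a concern: do the conversion with integer arithmetic and return a timezone-aware datetime. The names EPOCH and from_unix_ns are my own, and the sketch assumes the value is in nanoseconds, as in the question:

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def from_unix_ns(ns: int) -> datetime:
    """Convert Unix time in nanoseconds to a timezone-aware UTC datetime."""
    # Integer division avoids float rounding; datetime only stores
    # microseconds, so anything below one microsecond is dropped.
    return EPOCH + timedelta(microseconds=ns // 1000)

t_utc = from_unix_ns(1526139196864000000)
print(t_utc.strftime("%Y-%m-%d %H:%M:%S.%f"))
```

For a whole column the pandas approach is still the better fit; this is only meant for converting single values.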