I'm having trouble converting separate date and time columns from a CSV file into a single combined datetime column in a DataFrame.
Raw data:
Date Time
0 2014/9/2 08:30:00.0
1 2014/9/2 08:37:39.21
2 2014/9/2 08:39:41.2
3 2014/9/2 08:41:23.9
4 2014/9/2 09:13:01.1
5 2014/9/2 09:43:02.49
6 2014/9/2 10:49:16.115
7 2014/9/2 10:58:46.39
8 2014/9/2 11:46:18.5
9 2014/9/2 12:03:43.0
10 2014/9/2 12:56:22.0
11 2014/9/2 13:13:01.0
12 2014/9/2 14:42:22.39
13 2014/9/2 14:50:00.74
14 2014/9/3 08:30:00.0
15 2014/9/3 08:30:11.57
16 2014/9/3 08:39:02.18
17 2014/9/3 08:44:31.74
18 2014/9/3 08:45:16.105
19 2014/9/3 08:47:52.57
Concatenating the Date and Time columns:
df['datetime'] = df.Date + str(' ') + df.Time
0 2014/9/2 08:30:00.0
1 2014/9/2 08:37:39.21
2 2014/9/2 08:39:41.2
3 2014/9/2 08:41:23.9
4 2014/9/2 09:13:01.1
5 2014/9/2 09:43:02.49
6 2014/9/2 10:49:16.115
7 2014/9/2 10:58:46.39
8 2014/9/2 11:46:18.5
9 2014/9/2 12:03:43.0
Trying to parse the strings into datetime objects:
df['datetime'] = df['datetime'].apply(lambda x: datetime.strptime(x, '%Y/%m/%d %H:%M:%S.f%'))
This fails with:
ValueError: stray % in format '%Y/%m/%d %H:%M:%S.f%'
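What's wrong here, and how can I fix it?
According to the documentation, the format code for microseconds is %f, not f%. In your format string the trailing % has no directive letter after it, which is why strptime complains about a stray %.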
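To see the difference in isolation, here is a minimal sketch using only the standard library. The sample strings are copied from the question; the names FMT, samples, parsed and error_message are just illustrative. It also shows that %f accepts anywhere from one to six fractional digits, so values like .0, .21 and .115 should all parse with the same format.

```python
from datetime import datetime

# Corrected format: %f (microseconds), not f%
FMT = "%Y/%m/%d %H:%M:%S.%f"

# A few values taken from the question's data
samples = [
    "2014/9/2 08:30:00.0",
    "2014/9/2 08:37:39.21",
    "2014/9/2 10:49:16.115",
]

parsed = [datetime.strptime(s, FMT) for s in samples]
for s, p in zip(samples, parsed):
    # The fraction is right-padded to microseconds: .21 -> 210000
    print(s, "->", p.isoformat(), p.microsecond)

# The original format ends in a bare %, which strptime rejects
error_message = None
try:
    datetime.strptime(samples[0], "%Y/%m/%d %H:%M:%S.f%")
except ValueError as exc:
    error_message = str(exc)
print("ValueError:", error_message)
```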
Try this:
df['datetime'] = df['datetime'].apply(lambda x: datetime.strptime(x, '%Y/%m/%d %H:%M:%S.%f'))
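If you'd rather avoid the row-by-row apply, pd.to_datetime can parse the whole concatenated column at once with the same format string. This is only a sketch: the small DataFrame below is built by hand from a few rows of the question's data as a stand-in for your CSV.

```python
import pandas as pd

# Hand-built stand-in for the CSV, using rows from the question
df = pd.DataFrame({
    "Date": ["2014/9/2", "2014/9/2", "2014/9/3"],
    "Time": ["08:30:00.0", "08:37:39.21", "08:45:16.105"],
})

# Join the two string columns, then parse the whole column in one call
df["datetime"] = pd.to_datetime(
    df["Date"] + " " + df["Time"],
    format="%Y/%m/%d %H:%M:%S.%f",
)

print(df["datetime"])
```

On a large file this is usually faster than calling datetime.strptime once per row, and the result is a proper datetime column rather than a column of Python objects.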
Or, in one go (this assumes the CSV contains only the Date and Time columns):
(
pd.read_csv("test.csv")
.astype(str).agg(" ".join, axis=1)
.to_frame("datetime")
.apply(lambda col: pd.to_datetime(col, format='%Y/%m/%d %H:%M:%S.%f'))
)
# Output:
datetime
0 2014-09-02 08:30:00.000
1 2014-09-02 08:37:39.210
2 2014-09-02 08:39:41.200
3 2014-09-02 08:41:23.900
4 2014-09-02 09:13:01.100
.. ...
15 2014-09-03 08:30:11.570
16 2014-09-03 08:39:02.180
17 2014-09-03 08:44:31.740
18 2014-09-03 08:45:16.105
19 2014-09-03 08:47:52.570
[20 rows x 1 columns]
#dtypes
datetime datetime64[ns]
dtype: object