I have a DataFrame with two columns, "Etime" and "Stime", which hold end and start timestamps. A sample is shown below:
df = pd.DataFrame({'Etime': ['2019-08-23 00:00:06.773', '2019-09-19 00:00:16.083', '2019-08-29 00:00:07.043', '2019-10-01 00:00:14.777','2019-08-15 00:00:57.050'],
'Stime': ['2019-08-22 23:59:41.983', '2019-09-18 23:59:44.007', '2019-08-28 23:59:02.863', '2019-09-30 23:59:05.187', '2019-08-14 23:59:20.217']})
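What I want to do is create another column, "Duration", containing the difference between the start and end times in seconds. The final dataset should look like this: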
Etime Stime Duration
2019-08-23 00:00:06.773 2019-08-22 23:59:41.983 25
2019-09-19 00:00:16.083 2019-09-18 23:59:44.007 32
2019-08-29 00:00:07.043 2019-08-28 23:59:02.863 04
2019-10-01 00:00:14.777 2019-09-30 23:59:05.187 10
2019-08-15 00:00:57.050 2019-08-14 23:59:20.217 37
What I tried is:
df['STS'] = pd.to_timedelta(pd.to_datetime(df['Stime']).dt.time.astype(str)).dt.total_seconds()
df['EDS'] = pd.to_timedelta(pd.to_datetime(df['Etime']).dt.time.astype(str)).dt.total_seconds()
df['Duration'] = round(df['EDS'] - df['STS'], 0)
This gives me the wrong output, shown below:
Etime Stime Duration
2019-08-23 00:00:06.773 2019-08-22 23:59:41.983 -86375
2019-09-19 00:00:16.083 2019-09-18 23:59:44.007 -86368
2019-08-29 00:00:07.043 2019-08-28 23:59:02.863 -86336
2019-10-01 00:00:14.777 2019-09-30 23:59:05.187 -86330
2019-08-15 00:00:57.050 2019-08-14 23:59:20.217 -86303
What am I doing wrong here?
Is there a better way to do this?
Try this:
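from datetime import datetime

date_format = '%Y-%m-%d %H:%M:%S.%f'
# total_seconds() keeps the days and the fractional part that .seconds drops
df['Duration'] = [round((datetime.strptime(df.loc[x, 'Etime'], date_format) -
                         datetime.strptime(df.loc[x, 'Stime'], date_format)).total_seconds())
                  for x in df.index]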
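As for what goes wrong in the question's attempt: `.dt.time` discards the date, so an end time just after midnight compares as smaller than a start time just before midnight, and each result is off by one day (86400 seconds), e.g. 25 - 86400 = -86375. A vectorized alternative that avoids the Python loop is to parse the full timestamps and subtract them directly. The following is a minimal sketch, assuming both columns contain strings that `pd.to_datetime` can parse; the intermediate name `duration` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Etime': ['2019-08-23 00:00:06.773', '2019-09-19 00:00:16.083',
              '2019-08-29 00:00:07.043', '2019-10-01 00:00:14.777',
              '2019-08-15 00:00:57.050'],
    'Stime': ['2019-08-22 23:59:41.983', '2019-09-18 23:59:44.007',
              '2019-08-28 23:59:02.863', '2019-09-30 23:59:05.187',
              '2019-08-14 23:59:20.217'],
})

# Keep the date part: subtracting two datetime Series yields a timedelta Series.
duration = pd.to_datetime(df['Etime']) - pd.to_datetime(df['Stime'])

# Convert to seconds and round to the nearest whole second.
df['Duration'] = duration.dt.total_seconds().round().astype(int)

print(df['Duration'].tolist())  # [25, 32, 64, 70, 97]
```

Note that for the sample data the last three rows come out as 64, 70 and 97 seconds, since those intervals are longer than one minute; the 04, 10 and 37 listed in the expected output appear to be only the seconds part of each interval.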