Error when calculating the difference between two datetime timestamp [YYYY-MM-DD HH:MM:SS.000] columns in a Pandas DataFrame (in ...



I have a DataFrame with two columns, "Etime" and "Stime", which hold end and start timestamps. A sample looks like this:

import pandas as pd

df = pd.DataFrame({'Etime': ['2019-08-23 00:00:06.773', '2019-09-19 00:00:16.083', '2019-08-29 00:00:07.043', '2019-10-01 00:00:14.777', '2019-08-15 00:00:57.050'],
                   'Stime': ['2019-08-22 23:59:41.983', '2019-09-18 23:59:44.007', '2019-08-28 23:59:02.863', '2019-09-30 23:59:05.187', '2019-08-14 23:59:20.217']})

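What I want is to create another column, "Duration", containing the difference between the end and start times in seconds. The final dataset should look like this: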

Etime                      Stime                      Duration
2019-08-23 00:00:06.773    2019-08-22 23:59:41.983    25
2019-09-19 00:00:16.083    2019-09-18 23:59:44.007    32
2019-08-29 00:00:07.043    2019-08-28 23:59:02.863    04
2019-10-01 00:00:14.777    2019-09-30 23:59:05.187    10
2019-08-15 00:00:57.050    2019-08-14 23:59:20.217    37

Here is what I tried:

df['STS'] = pd.to_timedelta(pd.to_datetime(df['Stime']).dt.time.astype(str)).dt.total_seconds()
df['EDS'] = pd.to_timedelta(pd.to_datetime(df['Etime']).dt.time.astype(str)).dt.total_seconds()
df['Duration'] = round(df['EDS'] - df['STS'], 0)

This gives me the wrong output, shown below:

Etime                      Stime                         Duration
2019-08-23 00:00:06.773    2019-08-22 23:59:41.983      -86375
2019-09-19 00:00:16.083    2019-09-18 23:59:44.007      -86368
2019-08-29 00:00:07.043    2019-08-28 23:59:02.863      -86336
2019-10-01 00:00:14.777    2019-09-30 23:59:05.187      -86330
2019-08-15 00:00:57.050    2019-08-14 23:59:20.217      -86303

What am I doing wrong here?

Is there a better way to do this?

Try this:

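from datetime import datetime

date_format = '%Y-%m-%d %H:%M:%S.%f'
# Subtract the full datetimes (date included), then round the total seconds.
# total_seconds() is used because .seconds drops the fractional part and ignores whole days.
df['Duration'] = [round((datetime.strptime(df.loc[x, 'Etime'], date_format) -
                         datetime.strptime(df.loc[x, 'Stime'], date_format)).total_seconds())
                  for x in df.index]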
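
As for what went wrong in the original attempt: `.dt.time` throws away the date, so an end time just after midnight (00:00:06) compares as smaller than a start time just before midnight (23:59:41), and every result comes out 86400 seconds (one day) too low. A vectorized alternative that avoids the Python loop is sketched below. It assumes both columns parse cleanly with `pd.to_datetime` and that rounding to whole seconds is what is wanted; the intermediate names `etime` and `stime` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Etime': ['2019-08-23 00:00:06.773', '2019-09-19 00:00:16.083',
              '2019-08-29 00:00:07.043', '2019-10-01 00:00:14.777',
              '2019-08-15 00:00:57.050'],
    'Stime': ['2019-08-22 23:59:41.983', '2019-09-18 23:59:44.007',
              '2019-08-28 23:59:02.863', '2019-09-30 23:59:05.187',
              '2019-08-14 23:59:20.217'],
})

# Parse the full timestamps so the date is kept alongside the time of day.
etime = pd.to_datetime(df['Etime'])
stime = pd.to_datetime(df['Stime'])

# Subtracting two datetime Series yields a timedelta Series;
# total_seconds() converts it to float seconds, which are then rounded.
df['Duration'] = (etime - stime).dt.total_seconds().round().astype(int)

print(df)
```

On the sample data this gives durations of 25, 32, 64, 70 and 97 seconds. The last three differ from the 04, 10 and 37 in the desired table above because those intervals are longer than a minute, and the table appears to show only the seconds component rather than the total.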
