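I want to subtract two datetime values and get the result in HH:MM:SS format.
If the difference is more than one day, the days should be folded into the hours.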
I have two columns, `started_at` and `ended_at`. I tried to create a new column, `trip_duration`:
df['trip_duration'] = df['ended_at'] - df['started_at']
Sample table:
| started_at | ended_at |
| -------- | -------- |
| 2022-08-18 18:16:28+00:00 | 2022-08-18 19:20:28+00:00 |
| 2022-10-07 14:21:58+00:00 | 2022-10-07 14:41:58+00:00 |
| 2022-10-10 14:21:58+00:00 | 2022-10-11 02:21:58+00:00 |
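Note that the dates in the last row differ: the start date is 2022-10-10 and the end date is 2022-10-11.
I think I need to add a condition for this case, that is, when the end time of day is earlier than the start time of day (02:21:58+00:00 < 14:21:58+00:00) but falls on a later date.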
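For what it is worth, subtracting full timestamps (date and time together) already accounts for the date change, so an extra condition should not be needed. A minimal check on the third sample row, assuming both values are parsed as timezone-aware timestamps (the variable names `start`, `end`, and `delta` are only illustrative):

```python
import pandas as pd

# Third sample row: the end falls on the following calendar day.
start = pd.Timestamp('2022-10-10 14:21:58+00:00')
end = pd.Timestamp('2022-10-11 02:21:58+00:00')

delta = end - start            # Timedelta spanning the date change
print(delta)                   # 0 days 12:00:00
print(delta.total_seconds())   # 43200.0
```

For these two timestamps the elapsed time comes out as 12 hours.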
The expected output is:
| trip_duration |
| -------- |
| 01:04:00 |
| 00:20:00 |
| 36:00:00 |
import pandas as pd
# Create a sample dataframe
df = pd.DataFrame({'started_at': ['2022-08-18 18:16:28+00:00', '2022-10-07 14:21:58+00:00', '2022-10-10 14:21:58+00:00'],
'ended_at': ['2022-08-18 19:20:28+00:00', '2022-10-07 14:41:58+00:00', '2022-10-11 02:21:58+00:00']})
# Convert the columns to datetime objects
df['started_at'] = pd.to_datetime(df['started_at'])
df['ended_at'] = pd.to_datetime(df['ended_at'])
# Create a new column 'trip_duration'
df['trip_duration'] = df['ended_at'] - df['started_at']
# Convert each Timedelta to whole seconds (days are included in the total)
df['trip_duration'] = df['trip_duration'].dt.total_seconds().astype(int)
# Format as HH:MM:SS; the hour field may exceed 23 for multi-day trips
df['trip_duration'] = df['trip_duration'].apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x // 3600, (x % 3600) // 60, x % 60))
# Print the resulting dataframe
print(df)
Edit: fixed the error caused by the second df['trip_duration'] not being a Timedelta object.
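As an alternative sketch, the formatting step can be pulled into a small helper that works directly on a timedelta. The name `format_hms` is hypothetical (it is not part of pandas), and the sketch assumes non-negative durations and truncates fractional seconds:

```python
from datetime import timedelta

def format_hms(delta: timedelta) -> str:
    """Format a duration as HH:MM:SS, folding whole days into the hour field."""
    total = int(delta.total_seconds())        # truncates fractional seconds
    hours, remainder = divmod(total, 3600)    # hours may exceed 23
    minutes, seconds = divmod(remainder, 60)
    return f'{hours:02d}:{minutes:02d}:{seconds:02d}'

print(format_hms(timedelta(hours=1, minutes=4)))   # 01:04:00
print(format_hms(timedelta(days=1, hours=12)))     # 36:00:00
```

Because pandas' Timedelta is a subclass of `datetime.timedelta`, the helper should also be usable on the column, for example `df['trip_duration'] = (df['ended_at'] - df['started_at']).apply(format_hms)`.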