我希望你能帮我解决这个问题。我想使用python计算持续时间。
数据如下所示
group_id, id, departure_time, arrival_time
1,1,61200,61800
1,2,18000,18600
2,1,32400,34200
2,2,64800,66600
2,3,77400,81000
我想要得到的结果
group_id,id,departure_time, arrival time, departure_rotate, duration
1,1,61200,61800, 18000, 43800
1,2,18000,18600,61200,43800
2,1,32400,34200,77400,43200
2,2,64800,66600,64800, 1800
2,3,77400,81000,32400,48600
实际上,持续时间 = arrival_time - departure_rotate。但是,对于负结果,应添加(24 * 3600(。提前谢谢。到目前为止,我尝试过这个,但没有给出预期的结果。所以我认为我要求的方式会有所帮助。
df_duration = pd.DataFrame(df_trips[[
"group_id", "id", "arrival_time"
]], copy = True)
df_following = pd.DataFrame(df_trips[[
"group_id", "id", "departure_time"
]], copy = True)
df_following.columns = ["group_id", "id", "following_trip_departure_time"]
df_following["trip_id"] = df_following["id"] - 1
df_duration = pd.merge(df_duration, df_following, on = ["group_id", "id"])
假设给定的数据帧df
为:
# df
group_id id departure_time arrival_time
0 1 1 61200 61800
1 1 2 18000 18600
2 2 1 32400 34200
3 2 2 64800 66600
4 2 3 77400 81000
要计算departure_rotate
,可以使用df.groupby
对列group_id
上的数据帧df
进行分组,然后在列departure_time
上应用transform
,从而反转给定组的departure_time
。
要计算duration
,您可以减去arrival_time
和departure_rotate
列以获得diff
,然后使用np.where
根据diff
大于0
或大于duration
将新值分配给列。
用:
import numpy as np
df['departure_rotate'] = (
df.groupby('group_id')['departure_time'].transform(lambda s: s[::-1]))
diff = df['arrival_time'].sub(df['departure_rotate'])
df['duration'] = np.where(diff > 0, diff, diff.add(24*3600))
结果:
# print(df)
group_id id departure_time arrival_time departure_rotate duration
0 1 1 61200 61800 18000 43800
1 1 2 18000 18600 61200 43800
2 2 1 32400 34200 77400 43200
3 2 2 64800 66600 64800 1800
4 2 3 77400 81000 32400 48600