Pandas: compute a value for each row using information from a previous row?

I have a Pandas DataFrame df that looks like this (it has other columns as well):

timestamp                   player  event              location_x  location_y  location_z  dist
2021-07-14 22:54:28.001000  Bob     "PlayerMoveEvent"  10          10          10          ?
2021-07-14 22:54:28.001600  Alice   "PlayerJoinEvent"  NaN         NaN         NaN         ?
2021-07-14 22:54:28.001600  Alice   "PlayerMoveEvent"  20          20          20          ?
2021-07-14 22:54:28.001670  Bob     "PlayerMoveEvent"  11          10          10          ?
2021-07-14 22:54:28.001740  Eve     "PlayerMoveEvent"  5           15          9           ?
2021-07-14 22:54:28.001670  Eve     "PlayerQuitEvent"  NaN         NaN         NaN         ?
2021-07-14 22:54:28.001820  Alice   "PlayerMoveEvent"  18          20          19          ?

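Your current attempt feels very imperative, but I find it often helps to treat Pandas more like a declarative language: describe the rows you want through filters, sorts, and group-bys rather than looping over rows one at a time.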

Here is how I would approach the problem statement (note that I have not tested this code):

import numpy as np

# the function name is a placeholder; player_0 and player_1 are the two player names to compare
def latest_distance(df, player_0, player_1):
    # filter down to move events for just these two players
    df_important_events = df.loc[df['player'].isin([player_0, player_1]) & (df['event'] == 'PlayerMoveEvent')]
    # make sure events are timestamp-ordered for the next step
    df_important_events = df_important_events.sort_values('timestamp')
    # filter down to the latest move event per player
    df_latest_move_events = df_important_events.groupby('player').last()
    if len(df_latest_move_events) != 2:
        # the caller passed in an invalid player, or one of the players has not moved yet;
        # raising is one way to handle this, returning NaN would be another
        raise ValueError('need one latest move event for each of two distinct players')
    # filter down to just the location columns and convert to a (2, 3) NumPy array
    locations_array = df_latest_move_events[['location_x', 'location_y', 'location_z']].to_numpy(dtype=float)
    # return the Euclidean distance between the two locations
    return np.linalg.norm(locations_array[1] - locations_array[0])
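
To make the approach concrete, here is a minimal end-to-end sketch that runs the same filter, sort, and group-by steps on the sample rows from the question. Several things in it are assumptions rather than part of the original answer: the helper name `latest_distance`, the choice to return NaN when either player has no move event yet, and the coordinate values, which are one reading of the question's table.

```python
import numpy as np
import pandas as pd

# Sample rows from the question; coordinate values are an assumed reading of its table.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2021-07-14 22:54:28.001000",
                "2021-07-14 22:54:28.001600",
                "2021-07-14 22:54:28.001600",
                "2021-07-14 22:54:28.001670",
                "2021-07-14 22:54:28.001740",
                "2021-07-14 22:54:28.001670",
                "2021-07-14 22:54:28.001820",
            ]
        ),
        "player": ["Bob", "Alice", "Alice", "Bob", "Eve", "Eve", "Alice"],
        "event": [
            "PlayerMoveEvent",
            "PlayerJoinEvent",
            "PlayerMoveEvent",
            "PlayerMoveEvent",
            "PlayerMoveEvent",
            "PlayerQuitEvent",
            "PlayerMoveEvent",
        ],
        "location_x": [10, np.nan, 20, 11, 5, np.nan, 18],
        "location_y": [10, np.nan, 20, 10, 15, np.nan, 20],
        "location_z": [10, np.nan, 20, 10, 9, np.nan, 19],
    }
)

LOCATION_COLUMNS = ["location_x", "location_y", "location_z"]


def latest_distance(df, player_0, player_1):
    """Hypothetical helper: distance between two players' latest move positions."""
    # Keep only move events for the two players of interest.
    moves = df[df["player"].isin([player_0, player_1]) & (df["event"] == "PlayerMoveEvent")]
    # Order by time, then take the last (most recent) position per player.
    latest = moves.sort_values("timestamp").groupby("player")[LOCATION_COLUMNS].last()
    if len(latest) != 2:
        # Assumption: report NaN if a player is unknown or has not moved yet.
        return np.nan
    first, second = latest.to_numpy(dtype=float)
    return float(np.linalg.norm(first - second))


dist = latest_distance(df, "Bob", "Alice")
print(dist)
```

With these values, Bob's latest position is (11, 10, 10) and Alice's is (18, 20, 19), so the result is the square root of 7² + 10² + 9² = 230, roughly 15.17.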
