I have a Pandas DataFrame `df` that looks like this (there are other columns as well):

| timestamp | player | event | location_x | location_y | location_z | dist |
|---|---|---|---|---|---|---|
| 2021-07-14 22:54:28.001000 | Bob | "PlayerMoveEvent" | 10 | 10 | 10 | ? |
| 2021-07-14 22:54:28.001600 | Alice | "PlayerJoinEvent" | NaN | NaN | NaN | ? |
| 2021-07-14 22:54:28.001600 | Alice | "PlayerMoveEvent" | 20 | 20 | 20 | ? |
| 2021-07-14 22:54:28.001670 | Bob | "PlayerMoveEvent" | 11 | 10 | 10 | ? |
| 2021-07-14 22:54:28.001740 | Eve | "PlayerMoveEvent" | 5 | 15 | 9 | ? |
| 2021-07-14 22:54:28.001670 | Eve | "PlayerQuitEvent" | NaN | NaN | NaN | ? |
| 2021-07-14 22:54:28.001820 | Alice | "PlayerMoveEvent" | 18 | 20 | 19 | ? |

Your current attempt feels very imperative, but I find it often helps to treat Pandas more like a declarative language.
Here is how I would approach the problem statement (note that I have not tested this code):

    import numpy as np

    def distance_between(df, player_0, player_1):
        # filter down to move events for just these two players
        is_player = df['player'].isin([player_0, player_1])
        is_move = df['event'] == 'PlayerMoveEvent'
        df_important_events = df.loc[is_player & is_move]
        # make sure events are timestamp-ordered for the next step
        df_important_events = df_important_events.sort_values('timestamp')
        # filter down to the latest move event per player (one row per player)
        df_latest_move_events = df_important_events.groupby('player').last()
        if len(df_latest_move_events) != 2:
            # an invalid player was passed in, or one of the players has not
            # moved yet; returning NaN is one option, raising is another
            return np.nan
        # filter down to just location columns
        df_locations = df_latest_move_events[['location_x', 'location_y', 'location_z']]
        # convert to a (2, 3) float array; to_records() would give a structured
        # array that includes the index and cannot be subtracted
        locations_array = df_locations.to_numpy(dtype=float)
        # return Euclidean distance between the two locations
        return np.linalg.norm(locations_array[1] - locations_array[0])
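
To sanity-check the approach, here is a minimal, self-contained sketch that runs it against the sample rows from the question. The helper name `latest_move_distance` and the choice to return NaN when a player has no move event are my own assumptions, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question's table; NaN marks events without a location.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2021-07-14 22:54:28.001000",
                "2021-07-14 22:54:28.001600",
                "2021-07-14 22:54:28.001600",
                "2021-07-14 22:54:28.001670",
                "2021-07-14 22:54:28.001740",
                "2021-07-14 22:54:28.001670",
                "2021-07-14 22:54:28.001820",
            ]
        ),
        "player": ["Bob", "Alice", "Alice", "Bob", "Eve", "Eve", "Alice"],
        "event": [
            "PlayerMoveEvent",
            "PlayerJoinEvent",
            "PlayerMoveEvent",
            "PlayerMoveEvent",
            "PlayerMoveEvent",
            "PlayerQuitEvent",
            "PlayerMoveEvent",
        ],
        "location_x": [10, np.nan, 20, 11, 5, np.nan, 18],
        "location_y": [10, np.nan, 20, 10, 15, np.nan, 20],
        "location_z": [10, np.nan, 20, 10, 9, np.nan, 19],
    }
)

LOCATION_COLUMNS = ["location_x", "location_y", "location_z"]


def latest_move_distance(df, player_0, player_1):
    """Hypothetical helper: distance between two players' latest move positions."""
    # Keep only move events for the two players of interest.
    is_player = df["player"].isin([player_0, player_1])
    is_move = df["event"] == "PlayerMoveEvent"
    moves = df.loc[is_player & is_move]
    # Order by time, then take the last location row per player.
    latest = moves.sort_values("timestamp").groupby("player")[LOCATION_COLUMNS].last()
    if len(latest) != 2:
        # Assumption: signal "no answer" with NaN rather than raising.
        return np.nan
    xyz = latest.to_numpy(dtype=float)  # shape (2, 3)
    return float(np.linalg.norm(xyz[1] - xyz[0]))


dist_bob_alice = latest_move_distance(df, "Bob", "Alice")
print(dist_bob_alice)
```

Bob's latest position is (11, 10, 10) and Alice's is (18, 20, 19), so the result is sqrt(7² + 10² + 9²) = sqrt(230), roughly 15.17. Asking about a player with no move events yields NaN under the assumption above.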