How to fill empty ID cells using the IDs from either of two other DataFrames



I have the following table, in which some (but not all) user IDs are missing:

user ID  item ID  item type
10       223      question
NaN      126      answer
14       129      question

Try numpy.select:

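import numpy as np

# Rows with a missing user ID, split by item type
conditions = [df["user ID"].isnull() & df["item type"].eq("question"),
              df["user ID"].isnull() & df["item type"].eq("answer")]
# Look up the user ID by item ID in the DataFrame that matches the item type
choices = [df["item ID"].map(dict(zip(question["item ID"], question["user ID"]))),
           df["item ID"].map(dict(zip(answer["item ID"], answer["user ID"])))]
# Keep the existing user ID wherever neither condition holds
df["user ID"] = np.select(conditions, choices, df["user ID"])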
>>> df
user ID  item ID item type
0     10.0      123  question
1     10.0      126    answer
2     14.0      129  question
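
For anyone who wants to run the numpy.select approach end to end, here is a minimal self-contained sketch. The contents of `question` and `answer` are made up for illustration (the question does not show them), and the helper name `fill_user_ids` is hypothetical. It assumes each lookup table has unique item IDs, because `Series.map` with a Series needs a unique index.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the two lookup tables are assumptions,
# since their contents are not shown in the question.
df = pd.DataFrame({
    "user ID": [10, np.nan, 14],
    "item ID": [123, 126, 129],
    "item type": ["question", "answer", "question"],
})
question = pd.DataFrame({"item ID": [123, 129], "user ID": [10, 14]})
answer = pd.DataFrame({"item ID": [126], "user ID": [10]})


def fill_user_ids(df, question, answer):
    """Return a copy of df with missing user IDs looked up by item type."""
    missing = df["user ID"].isna()
    # item ID -> user ID lookups (item IDs assumed unique per table)
    q_lookup = question.set_index("item ID")["user ID"]
    a_lookup = answer.set_index("item ID")["user ID"]
    conditions = [
        missing & df["item type"].eq("question"),
        missing & df["item type"].eq("answer"),
    ]
    choices = [df["item ID"].map(q_lookup), df["item ID"].map(a_lookup)]
    out = df.copy()
    # Rows matching neither condition keep their original user ID
    out["user ID"] = np.select(conditions, choices, default=df["user ID"])
    return out


filled = fill_user_ids(df, question, answer)
print(filled)
```

The column stays float because it held NaN to begin with. If integer IDs are wanted, `filled["user ID"].astype("Int64")` converts to pandas' nullable integer type.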

You can use np.where() and merge to get the desired data:

import numpy as np
import pandas as pd
# Use 0 as a placeholder for missing user IDs (assumes 0 is never a real ID)
df['user ID'] = df['user ID'].fillna(0).astype(int)
# A left merge keeps exactly the rows of df
df_final = pd.merge(left=df, right=answer_df, on='item ID', how='left', suffixes=('', '_right'))
df_final['user ID'] = np.where(df_final['user ID'] == 0, df_final['user ID_right'], df_final['user ID']).astype(int)
df_final[['user ID', 'item ID', 'item type']]
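
The merge idea can also be extended to cover both lookup tables at once. This is only a sketch under assumptions: the sample data is made up, the names `lookup` and `result` are illustrative, and each (item ID, item type) pair is assumed to appear at most once in the combined lookup so the left merge does not duplicate rows. It uses `fillna` instead of a 0 placeholder, so no sentinel value is needed.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (the lookup tables are not shown in the question)
df = pd.DataFrame({
    "user ID": [10, np.nan, 14],
    "item ID": [123, 126, 129],
    "item type": ["question", "answer", "question"],
})
question = pd.DataFrame({"item ID": [123, 129], "user ID": [10, 14]})
answer = pd.DataFrame({"item ID": [126], "user ID": [10]})

# Stack both lookup tables, labelling each row with the item type it serves
lookup = pd.concat(
    [
        question.assign(**{"item type": "question"}),
        answer.assign(**{"item type": "answer"}),
    ],
    ignore_index=True,
)

# A left merge keeps every row of df, in its original order
merged = df.merge(
    lookup, on=["item ID", "item type"], how="left", suffixes=("", "_lookup")
)

# Fill only the missing user IDs from the looked-up column
merged["user ID"] = merged["user ID"].fillna(merged["user ID_lookup"])
result = merged.drop(columns="user ID_lookup")

# Nullable integer type: IDs that are still missing stay <NA> instead of raising
result["user ID"] = result["user ID"].astype("Int64")
print(result)
```

Merging on both item ID and item type means an answer row can never pick up a user ID from the question table, even if the two tables happen to share an item ID.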
