Hello, I'm trying to sort an existing column in pandas by another column. Here is the output of my pandas dataset:
d_id a b c d sort_d_id
3 1 1 0 0 53
21 0 0 0 1 32
32 0 1 0 1 32
32 0 1 0 1 3
53 0 0 1 0 21
Basically, I want to sort d_id by sort_d_id. The result I want is:
d_id a b c d sort_d_id
53 0 0 1 0 53
32 0 1 0 1 32
32 0 1 0 1 32
3 1 1 0 0 3
21 0 0 0 1 21
Can you tell me how to do this?
Here is how:
import numpy as np

# Get the d_id and sort_d_id columns for easy handling and rearranging
sort_d_id = df["sort_d_id"].tolist()
d_id = df["d_id"].tolist()
# Also get the row indices (i.e. 0 to 4) in order.
indices = list(range(len(d_id)))
# The 1st column of arr will be indices, 2nd column is sort_d_id, 3rd column is d_id.
arr = np.array([indices, sort_d_id, d_id]).T
# Sort the rows of arr according to sort_d_id (the 2nd column).
arr = arr[np.argsort(arr[:, 1])]
# The order is now the d_id indices mapped to the sort_id_indices
# e.g. d_id[0] will become d_id[3], d_id[1] will become d_id[4], etc.
new_order = arr[:, 0]
# But we don't want that; we want sort_d_id mapped to the d_id indices.
# So we do another reordering and get the new indices.
new_order_zipped = sorted(zip(new_order, indices))
final_order = list(map(list, zip(*new_order_zipped)))[1]
# Reorder by the final indices.
df = df.reindex(final_order)
# Replace the sort_d_id column which got messed up during reordering.
df["sort_d_id"] = sort_d_id
# Reset indices if you want
df = df.reset_index(drop=True)
The resulting df should now look like the result you asked for. I know this is confusing, but it does work.
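For what it's worth, the reordering above boils down to computing the rank of each sort_d_id value, so it can probably be written more compactly. This is only a sketch under two assumptions of mine: the sample frame is rebuilt by hand from the question, and sort_d_id contains exactly the same values as d_id, just in a different order. The names `ranks` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (assumption).
df = pd.DataFrame({
    "d_id": [3, 21, 32, 32, 53],
    "a": [1, 0, 0, 0, 0],
    "b": [1, 0, 1, 1, 0],
    "c": [0, 0, 0, 0, 1],
    "d": [0, 1, 1, 1, 0],
    "sort_d_id": [53, 32, 32, 3, 21],
})

sort_d_id = df["sort_d_id"].to_numpy()

# argsort of argsort gives, for each entry of sort_d_id, the position it
# would occupy in ascending order (its rank); "stable" keeps ties in order.
ranks = np.argsort(np.argsort(sort_d_id, kind="stable"), kind="stable")

# Sort the rows by d_id ascending, then pick them in rank order so that
# row i ends up holding the d_id equal to sort_d_id[i].
result = (
    df.sort_values("d_id", kind="mergesort")
      .iloc[ranks]
      .reset_index(drop=True)
)

# Put the sort_d_id column back in its original order.
result["sort_d_id"] = sort_d_id
print(result)
```

If sort_d_id is not a permutation of d_id, the rows will not line up, so it may be worth checking that first.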
The only way I can think of to get your desired output is:
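import pandas as pd

# Sort sort_d_id in descending order, comparing the values as strings.
df_1 = df['sort_d_id'].sort_values(ascending=False, key=lambda x: x.astype('string'))
# Sort the remaining columns by d_id the same way and renumber the rows.
df_2 = df.drop('sort_d_id', axis=1).sort_values('d_id', ascending=False, key=lambda x: x.astype('string')).reset_index(drop=True)
# Put the two pieces side by side and restore the original column order.
df = pd.concat([df_1, df_2], axis=1)[df.columns]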
print(df)
Output:
   d_id a b c d sort_d_id
0 53 0 0 1 0 53
1 32 0 1 0 1 32
2 32 0 1 0 1 32
3 3 1 1 0 0 3
4 21 0 0 0 1 21
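One caveat on the string-based sort: it matches here because the wanted order (53, 32, 32, 3, 21) happens to be the descending string order of those values, which will not hold for arbitrary data. A possible alternative, offered as a sketch rather than a definitive answer, is to rank each d_id by where it first appears in sort_d_id and sort on that rank. The frame below is rebuilt by hand from the question, and `rank`, `target` and `result` are illustrative names of my own.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumption).
df = pd.DataFrame({
    "d_id": [3, 21, 32, 32, 53],
    "a": [1, 0, 0, 0, 0],
    "b": [1, 0, 1, 1, 0],
    "c": [0, 0, 0, 0, 1],
    "d": [0, 1, 1, 1, 0],
    "sort_d_id": [53, 32, 32, 3, 21],
})

# Position of each distinct value by its first appearance in sort_d_id.
rank = {value: pos for pos, value in enumerate(pd.unique(df["sort_d_id"]))}

# Keep the target order aside; it is re-attached after the rows are moved.
target = df["sort_d_id"].to_numpy()

# Sort the other columns by the rank of d_id; mergesort is stable,
# so rows with the same d_id keep their relative order.
result = (
    df.drop(columns="sort_d_id")
      .sort_values("d_id", key=lambda s: s.map(rank), kind="mergesort")
      .reset_index(drop=True)
)
result["sort_d_id"] = target
print(result)
```

This assumes every d_id also occurs in sort_d_id the same number of times; a d_id missing from sort_d_id would map to NaN and be pushed to the end.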