How do I sort an existing column by the values of another column?



Hello, I am trying to sort an existing column by another column in pandas. Here is the output of my pandas dataset:

d_id a  b   c   d   sort_d_id
3    1  1   0   0   53
21   0  0   0   1   32
32   0  1   0   1   32
32   0  1   0   1   3
53   0  0   1   0   21

Basically, I want to sort d_id by sort_d_id. The result I want is:

d_id a  b   c   d   sort_d_id
53   0  0   1   0   53
32   0  1   0   1   32
32   0  1   0   1   32
3    1  1   0   0   3
21   0  0   0   1   21

Could you tell me how to do this?

Here is how to do it:

import numpy as np

# Get the d_id and sort_d_id columns as plain lists for easy handling and rearranging.
sort_d_id = df["sort_d_id"].tolist()
d_id = df["d_id"].tolist()
# Also get the row indices (i.e. 0 to 4) in order.
indices = list(range(len(d_id)))
# The 1st column of arr will be indices, 2nd column is sort_d_id, 3rd column is d_id.
arr = np.array([indices, sort_d_id, d_id]).T
# Sort the rows by sort_d_id (stable, so equal values keep their original order).
arr = arr[np.argsort(arr[:, 1], kind="stable")]
# new_order[k] is the row index that holds the k-th smallest sort_d_id.
new_order = arr[:, 0]
# But we don't want that; we want the inverse mapping: for each row index,
# the rank of its sort_d_id. So we do another reordering and get the new indices.
new_order_zipped = sorted(zip(new_order, indices))
final_order = list(map(list, zip(*new_order_zipped)))[1]
# Reorder by the final indices. This relies on df already being sorted by d_id
# in ascending order with the default 0..n-1 index, as in the question.
df = df.reindex(final_order)
# Replace the sort_d_id column, which got shuffled along with the rows.
df["sort_d_id"] = sort_d_id
# Reset indices if you want.
df = df.reset_index(drop=True)

The resulting df should now match the desired output in the question. I know this is confusing, but it does work.
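For what it's worth, the same rank-based idea can probably be written more compactly. The following is only a sketch under the same assumptions as the answer above: sort_d_id contains exactly the same values as d_id, just in a different order. The helper name `reorder_by_rank` is made up for illustration, and the sample data is copied from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "d_id": [3, 21, 32, 32, 53],
    "a": [1, 0, 0, 0, 0],
    "b": [1, 0, 1, 1, 0],
    "c": [0, 0, 0, 0, 1],
    "d": [0, 1, 1, 1, 0],
    "sort_d_id": [53, 32, 32, 3, 21],
})


def reorder_by_rank(frame):
    """Hypothetical helper: reorder rows so that d_id follows sort_d_id."""
    target = frame["sort_d_id"].to_numpy()
    # argsort of argsort gives the rank of each sort_d_id value (0 = smallest).
    ranks = np.argsort(np.argsort(target, kind="stable"), kind="stable")
    # Sort the remaining columns by d_id so that position r holds the
    # r-th smallest d_id, then pick the rows in rank order.
    others = (
        frame.drop(columns="sort_d_id")
        .sort_values("d_id", kind="mergesort")
        .reset_index(drop=True)
    )
    out = others.iloc[ranks].reset_index(drop=True)
    # Put the untouched sort_d_id column back (assigned by position).
    out["sort_d_id"] = target
    return out


result = reorder_by_rank(df)
print(result)
```

Because the helper sorts by d_id itself, it should not matter whether the input rows arrive pre-sorted, but it still assumes the two columns hold the same multiset of values.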

The only way I can think of to get the output you want is:

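import pandas as pd

# Sort both parts in descending *string* order ("53" > "32" > "3" > "21"),
# which happens to be the order requested, then glue them back together.
df_1 = df['sort_d_id'].sort_values(ascending=False, key=lambda x: x.astype('string'))
df_2 = df.drop('sort_d_id', axis=1).sort_values('d_id', ascending=False, key=lambda x: x.astype('string')).reset_index(drop=True)
df = pd.concat([df_1, df_2], axis=1)[df.columns]
print(df)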

Output:

   d_id  a  b  c  d  sort_d_id
0    53  0  0  1  0         53
1    32  0  1  0  1         32
2    32  0  1  0  1         32
3     3  1  1  0  0          3
4    21  0  0  0  1         21
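If the target order is not simply a descending string order, one alternative might be to match each sort_d_id value to a row with the same d_id using a merge. This is a sketch, not a tested general solution: the helper name `align_rows_to_target` and the temporary `_dup` column are made up for illustration, and it assumes every sort_d_id value also occurs in d_id the same number of times.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "d_id": [3, 21, 32, 32, 53],
    "a": [1, 0, 0, 0, 0],
    "b": [1, 0, 1, 1, 0],
    "c": [0, 0, 0, 0, 1],
    "d": [0, 1, 1, 1, 0],
    "sort_d_id": [53, 32, 32, 3, 21],
})


def align_rows_to_target(frame, key="d_id", target="sort_d_id"):
    """Hypothetical helper: reorder rows so that `key` follows `target`."""
    rows = frame.drop(columns=target).copy()
    wanted = frame[[target]].rename(columns={target: key})
    # Number repeated values (first 32 -> 0, second 32 -> 1) so that
    # duplicates are paired one-to-one instead of multiplying rows.
    rows["_dup"] = rows.groupby(key).cumcount()
    wanted["_dup"] = wanted.groupby(key).cumcount()
    # A left merge keeps the row order of `wanted`, i.e. the target order.
    out = wanted.merge(rows, on=[key, "_dup"], how="left").drop(columns="_dup")
    # Restore the original sort_d_id column and column order.
    out[target] = frame[target].to_numpy()
    return out[frame.columns]


result = align_rows_to_target(df)
print(result)
```

A sort_d_id value with no matching d_id would leave NaN in the other columns of that row, so it is worth checking for missing values before relying on the result.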
