How to share the sort order between two lists in Python



I have a list of ids and a list of dates. Both are single entries of separate pandas DataFrame columns. Each date corresponds to one id. Something like:

[852634, 727417, 881231]   [2018-05-29, 2015-11-23, 2019-06-26]

How can I sort the dates (ascending or descending, it doesn't matter) and carry the same ordering over to the ids?

The desired result is:

[727417, 852634, 881231]   [ 2015-11-23, 2018-05-29, 2019-06-26]

Thanks in advance for all your suggestions, Alessandro

Zip...

>>> x = [852634, 727417, 881231]
>>> y = ["2018-05-29", "2015-11-23", "2019-06-26"]
>>> list(zip(y, x))
[('2018-05-29', 852634), ('2015-11-23', 727417), ('2019-06-26', 881231)]

Sort...

>>> sorted(zip(y,x))
[('2015-11-23', 727417), ('2018-05-29', 852634), ('2019-06-26', 881231)]

Then unzip.

>>> [x for _, x in sorted(zip(y,x))]
[727417, 852634, 881231]

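This is an example of a general technique known as the Schwartzian transform. You decorate the list of ids to be sorted with the corresponding dates, sort the decorated list, and then extract the (undecorated) original values from the result.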
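If both sorted lists are needed rather than only the ids, the decorate-sort-undecorate steps above can be sketched as follows. This assumes the dates are ISO-formatted strings (which sort chronologically as plain text) and that the lists are non-empty, since unpacking `zip(*...)` of an empty sequence fails. The names `ids` and `dates` are illustrative.

```python
ids = [852634, 727417, 881231]
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]

# Decorate: pair each date with its id (date first, so it is the sort key).
# Sort: tuples compare element by element, so the date decides the order.
# Undecorate: zip(*pairs) splits the sorted pairs back into two sequences.
sorted_dates, sorted_ids = zip(*sorted(zip(dates, ids)))

sorted_dates = list(sorted_dates)
sorted_ids = list(sorted_ids)

print(sorted_ids)    # [727417, 852634, 881231]
print(sorted_dates)  # ['2015-11-23', '2018-05-29', '2019-06-26']
```

Passing `reverse=True` to `sorted()` would give the descending order instead.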

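Using numpy:

import numpy as np

l1 = [852634, 727417, 881231]  # ids
l2 = ["2018-05-29", "2015-11-23", "2019-06-26"]  # dates
l1_key = np.argsort(l1)  # indices that sort l1; use np.argsort(l2) to order by date instead
l1_sorted = np.array(l1)[l1_key]
l2_sorted = np.array(l2)[l1_key]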

Output

print(l1_sorted)
print(l2_sorted)
[727417 852634 881231]
['2015-11-23' '2018-05-29' '2019-06-26']
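Since the question asks for the dates to drive the ordering, a variant that sorts by the date array may be closer to what is wanted. This is a minimal sketch, assuming the dates are ISO strings that NumPy can parse into `datetime64[D]`; the variable names are illustrative.

```python
import numpy as np

ids = np.array([852634, 727417, 881231])
dates = np.array(["2018-05-29", "2015-11-23", "2019-06-26"], dtype="datetime64[D]")

# argsort returns the positions that would put the dates in ascending order.
order = np.argsort(dates)

# Indexing both arrays with the same positions keeps each id next to its date.
ids_sorted = ids[order]
dates_sorted = dates[order]

print(ids_sorted)
print(dates_sorted)
```

Using `order[::-1]` as the index would give the descending order.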

If you already have a DataFrame, it may be much easier to .explode() them and .sort_values() there before exporting!

>>> import pandas as pd
>>> df = pd.DataFrame({"ids": [[852634, 727417, 881231], [90,100,110,115]], "dates": [["2018-05-29", "2015-11-23", "2019-06-26"], ["2015-01-01", "2021-01-01", "2020-01-01", "2021-01-01"]]})
>>> df
                        ids                                             dates
0  [852634, 727417, 881231]              [2018-05-29, 2015-11-23, 2019-06-26]
1       [90, 100, 110, 115]  [2015-01-01, 2021-01-01, 2020-01-01, 2021-01-01]
>>> df.explode(["ids", "dates"]).sort_values("dates")
      ids       dates
1      90  2015-01-01
0  727417  2015-11-23
0  852634  2018-05-29
0  881231  2019-06-26
1     110  2020-01-01
1     100  2021-01-01
1     115  2021-01-01
>>> df.explode(["ids", "dates"]).sort_values("dates")["ids"].to_numpy()
array([90, 727417, 852634, 881231, 110, 100, 115], dtype=object)
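The output above merges the ids of every row into one sorted sequence. If each row's lists should instead stay separate and be sorted on their own, one possible sketch is to regroup by the original row index after sorting. It assumes pandas 1.3 or later (needed for multi-column `explode`); `per_row` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "ids": [[852634, 727417, 881231], [90, 100, 110, 115]],
    "dates": [["2018-05-29", "2015-11-23", "2019-06-26"],
              ["2015-01-01", "2021-01-01", "2020-01-01", "2021-01-01"]],
})

per_row = (
    df.explode(["ids", "dates"])             # one (id, date) pair per line
      .sort_values("dates", kind="mergesort")  # stable sort keeps tied dates in original order
      .groupby(level=0)                      # regroup by the original row index
      .agg(list)                             # collect each row's values back into lists
)

print(per_row)
```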

You can use list.sort() for the ints. It sorts in place and returns None, so do not assign its result back:

ids.sort()

You can use datetime to sort the list of dates:

from datetime import datetime
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]
dates = [datetime.strptime(date, "%Y-%m-%d") for date in dates]  # %m is the month; %M would be minutes
dates.sort()
print(dates)

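So the code becomes the following. Note that it sorts each list independently, so the ids stay paired with their dates only when both lists happen to share the same order, as they do here:

from datetime import datetime
ids = [852634, 727417, 881231]
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]
print("Before: ", ids, dates)
ids.sort()  # sorts in place; list.sort() returns None
dates = [datetime.strptime(date, "%Y-%m-%d") for date in dates]  # %m is the month
dates.sort()
print("After: ", ids, dates)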
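To keep each id attached to its date even when the two orders differ, one option is to sort the positions by parsed date and then reorder both lists with those positions. This is only a sketch of that idea, assuming dates in `YYYY-MM-DD` form; `parsed` and `order` are illustrative names.

```python
from datetime import datetime

ids = [852634, 727417, 881231]
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]

# Parse the strings so the comparison is done on real dates.
parsed = [datetime.strptime(d, "%Y-%m-%d") for d in dates]

# Sort the positions 0..n-1 by the date found at each position.
order = sorted(range(len(parsed)), key=lambda i: parsed[i])

# Apply the same positions to both lists so the pairs stay together.
sorted_ids = [ids[i] for i in order]
sorted_dates = [parsed[i] for i in order]

print(sorted_ids)  # [727417, 852634, 881231]
```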
