Python: Compare the elements of two string lists and sort by similarity

Background

I have two lists:

list_1 = ['a_1', 'b_2', 'c_3']
list_2 = [ 'g b 2', 'f a 1', 'h c 3']

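Note that the string elements in the two lists are formatted differently. In other words, the elements of one list are not a subset of the other's.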

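I want to:

compare the elements of list 1 and list 2 to determine which elements of list 2 are similar to those of list 1;
then sort list 1 into the same order as list 2, giving ['b_2', 'a_1', 'c_3'].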


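Existing questions

Question 1: Here, the elements of one list partly match the elements of the other exactly, e.g. '2010-01-01 00:00' and '2010-01'. In my case, however, the formats may differ.
Question 2: a similar situation.

Several other questions look at list comparison, but most of them compare strings that are formatted alike.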


Actual lists

list_1 = ['f_Total_water_withdrawal', 
'f_Precipitation', 
'f_Total-_enewable_water_resources', 
'f_Total_exploitable_water_resources',]
list_2 = ['Precipitation',
'Total-renewable-water-resources', 
'Total exploitable water resources', 
'Total water withdrawal']

I believe some information is missing. Nevertheless, for the given lists, we can devise this approach:

# 1. Format list 2 to look like list 1
list_2_mod = [s[2:].replace(" ", "_") for s in list_2]
# 2. Filter elements in list 2 not in list 1
list_final = [s for s in list_2_mod if s in list_1]

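Sensibly, given your list_1 (with unique elements, each of which has an obvious equivalent in list_2), you only need the first step. No sorting is needed, because list_2 is already in the desired order.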
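
For the actual lists in the question, the fixed `s[2:]` slice no longer applies, so a fuzzy comparison may be a better fit. The following is only a sketch, assuming that case and separator characters do not matter and that every element of list_2 has exactly one counterpart in list_1. The helper names `normalize` and `best_match` are made up for illustration; `difflib.SequenceMatcher` is part of the standard library.

```python
import re
from difflib import SequenceMatcher

list_1 = ['f_Total_water_withdrawal',
          'f_Precipitation',
          'f_Total-_enewable_water_resources',
          'f_Total_exploitable_water_resources']
list_2 = ['Precipitation',
          'Total-renewable-water-resources',
          'Total exploitable water resources',
          'Total water withdrawal']

def normalize(s):
    # Assumption: lowercase and collapse every run of non-alphanumerics to one space
    return re.sub(r'[^a-z0-9]+', ' ', s.lower()).strip()

def best_match(target, candidates):
    # Return the candidate whose normalized form is most similar to the target
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, normalize(target), normalize(c)).ratio(),
    )

# Walk list_2 in order and pick the closest list_1 element for each entry
sorted_list_1 = [best_match(s, list_1) for s in list_2]
print(sorted_list_1)
```

There is no similarity threshold here, so an element of list_2 with no real counterpart would still be paired with something, and two entries could in principle pick the same element of list_1.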

I may be misunderstanding, but if your lists match your example, you can simply define list_1 from list_2:

list_2 = ['g b 2', 'f a 1', 'h c 3']
list_1 = [f"{s[2]}_{s[4]}" for s in list_2]
print(list_1)

Output:

['b_2', 'a_1', 'c_3']
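
The fixed indices `s[2]` and `s[4]` only work when every token is a single character. A variant that splits on whitespace might be less fragile; this is a sketch that assumes the first token of each list_2 element is a prefix to discard.

```python
list_2 = ['g b 2', 'f a 1', 'h c 3']

# Drop the first whitespace-separated token and join the rest with underscores
list_1 = ["_".join(s.split()[1:]) for s in list_2]
print(list_1)  # ['b_2', 'a_1', 'c_3']
```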

I think I roughly understand what you are trying to do. See the following:

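list_1 = ['a_1', 'b_2', 'c_3']
list_2 = ['g b 2', 'f a 1', 'h c 3']
# Map each list_1 element to its list_2-style key, e.g. 'a 1' -> 'a_1'
dict_1 = {item1[0] + ' ' + item1[-1]: item1 for item1 in list_1}
# Walk list_2 in order, drop the 2-character prefix, and look up the match
l = [dict_1[item2[2:]] for item2 in list_2 if item2[2:] in dict_1]
print(l)  # ['b_2', 'a_1', 'c_3']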
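
The same idea can also be phrased as a sort with a key function. This is a sketch under the same assumption about the toy format (the first token of each list_2 element is a prefix to drop); `key_from_list_2` is a hypothetical helper. Unlike the list comprehension above, elements of list_1 with no match are kept and moved to the end rather than dropped.

```python
list_1 = ['a_1', 'b_2', 'c_3']
list_2 = ['g b 2', 'f a 1', 'h c 3']

def key_from_list_2(s):
    # 'g b 2' -> 'b_2': drop the first token, join the rest with underscores
    return "_".join(s.split()[1:])

# Position of each converted list_2 element, e.g. {'b_2': 0, 'a_1': 1, 'c_3': 2}
order = {key_from_list_2(s): i for i, s in enumerate(list_2)}

# Sort list_1 by that position; unmatched elements sort last
sorted_list_1 = sorted(list_1, key=lambda s: order.get(s, len(order)))
print(sorted_list_1)  # ['b_2', 'a_1', 'c_3']
```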
