Classify the items in a list based on another list



I want to classify the elements of items based on the list of lists in category.

category = [[1, "grape fruit"], [2, "orange fruit"], [3, "orange cookies"], [4, "pineapple pie"], [5, "strawberry"]]
items = ["grape juice - 1L", "Grape syrup - 500gr", "strawberry cookies 5's -2packs", "orange juice - 500gr", "orange cookies 10's 1 pack", "orange pudding - 1pcs", "pies - 1box"]

This is the result I want:

result = [1, 1, 5, 2, 3, 2, 4]

I would like to know whether this can be done with a for loop, or whether there is some other way to classify the items.

Thanks

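For this kind of problem, a simple approach is fuzzy matching. Python has several libraries that perform fuzzy matching, so you don't have to write one yourself.

Below is an example using the fuzzywuzzy package.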

from fuzzywuzzy import process

category = [[1, "grape fruit"], [2, "orange fruit"], [3, "orange cookies"], [4, "pineapple pie"], [5, "strawberry"]]
items = ["grape juice - 1L", "Grape syrup - 500gr", "strawberry cookies 5's -2packs", "orange juice - 500gr",
         "orange cookies 10's 1 pack", "orange pudding - 1pcs", "pies - 1box"]
result = []
# Map category id -> category name; extractOne accepts a dict of choices.
fuzzy_index = {cat[0]: cat[1] for cat in category}
for item in items:
    # Match on the first two words only, dropping the quantity and packaging details.
    matcher = " ".join(item.split(" ")[:2])
    # With dict choices, extractOne returns (matched value, score, key); keep the key.
    val = process.extractOne(matcher, fuzzy_index)[2]
    result.append(val)
print(result)

The result is:

[1, 1, 5, 2, 3, 2, 4]

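This approach is tailored to this sample set. Depending on how complex your real data is, you can build the matching index with fuzzy matching or with the nltk package.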
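Since the question asks whether a plain for loop can do this, here is a minimal sketch that uses only the standard library. Note that it swaps fuzzy matching for a simpler technique: it scores each category by the fraction of its words that also appear in the item. The helper names `tokens` and `classify`, the crude plural stripping, and the first-listed-wins tie-break are all assumptions made for illustration, not part of fuzzywuzzy or of the original answer.

```python
import re

category = [[1, "grape fruit"], [2, "orange fruit"], [3, "orange cookies"],
            [4, "pineapple pie"], [5, "strawberry"]]
items = ["grape juice - 1L", "Grape syrup - 500gr", "strawberry cookies 5's -2packs",
         "orange juice - 500gr", "orange cookies 10's 1 pack",
         "orange pudding - 1pcs", "pies - 1box"]


def tokens(text):
    """Return the lowercase words of text, with a crude plural strip ("pies" -> "pie")."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}


def classify(items, category):
    """Assign each item the id of the category whose words it covers best (hypothetical helper)."""
    result = []
    for item in items:
        item_tokens = tokens(item)
        best_id, best_score = None, 0.0
        for cat_id, name in category:
            cat_tokens = tokens(name)
            if not cat_tokens:
                continue
            # Fraction of the category's words that also occur in the item.
            score = len(item_tokens & cat_tokens) / len(cat_tokens)
            # Strict ">" means ties go to the category listed first.
            if score > best_score:
                best_id, best_score = cat_id, score
        result.append(best_id)  # None when no word matches at all
    return result


result = classify(items, category)
print(result)  # [1, 1, 5, 2, 3, 2, 4]
```

On this sample the output matches the desired result, but partly because of the tie-break: "orange juice" and "orange pudding" score the same for categories 2 and 3, and category 2 wins only because it comes first. Misspellings or word forms beyond a trailing "s" will not match, which is where a fuzzy matcher is the better tool.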