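These are two related datasets, but they come from separate JSON files, so I want to merge them. They can be matched on index, but I haven't really found a good way to do it :)
List of dicts 1: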
[
{'index': 217, 'name': 'Battery'},
{'index': 218, 'name': 'Fluffy'},
{'index': 219, 'name': 'Dazzling'},
{'index': 220, 'name': 'Soul-Heart'}
]
List of dicts 2:
[
{'index': 217, 'desc': 'Text info 2'},
{'index': 218, 'desc': 'will be very informative'},
{'index': 219, 'desc': 'dont know what else i could write here'},
{'index': 220, 'desc': 'Boosts my wallet'}
]
The result should look like this:
[
{'index': 217, 'name': 'Battery', 'desc': 'Text info 2'},
{'index': 218, 'name': 'Fluffy', 'desc': 'will be very informative'},
{'index': 219, 'name': 'Dazzling', 'desc': 'dont know what else i could write here'},
{'index': 220, 'name': 'Soul-Heart', 'desc': 'Boosts my wallet'}
]
There is more data, but once I know how to merge, I think I can do the rest.
Pandas makes merging a breeze.
First, convert the data to DataFrames:
import pandas as pd
data1 = [
{'index': 217, 'name': 'Battery'},
{'index': 218, 'name': 'Fluffy'},
{'index': 219, 'name': 'Dazzling'},
{'index': 220, 'name': 'Soul-Heart'},
]
data2 = [
{'index': 217, 'desc': 'Text info 2'},
{'index': 218, 'desc': 'will be very informative'},
{'index': 219, 'desc': 'dont know what else i could write here'},
{'index': 220, 'desc': 'Boosts my wallet'},
]
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
Then merge on the index column:
df_out = df1.merge(df2, on='index')
It looks like this:
   index        name                                    desc
0    217     Battery                             Text info 2
1    218      Fluffy                will be very informative
2    219    Dazzling  dont know what else i could write here
3    220  Soul-Heart                        Boosts my wallet
- Docs: pandas.DataFrame.merge()
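If the two files do not contain exactly the same indexes, the default inner merge silently drops the unmatched rows. Below is a minimal sketch of the difference, assuming a hypothetical index 221 that exists only in the second list; the variable names are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical data: index 218 has no description, index 221 has no name.
names = [
    {"index": 217, "name": "Battery"},
    {"index": 218, "name": "Fluffy"},
]
descs = [
    {"index": 217, "desc": "Text info 2"},
    {"index": 221, "desc": "no matching name"},
]

df_names = pd.DataFrame(names)
df_descs = pd.DataFrame(descs)

# The default how="inner" keeps only indexes present in both frames.
inner = df_names.merge(df_descs, on="index")

# how="outer" keeps every index; cells with no match become NaN.
outer = df_names.merge(df_descs, on="index", how="outer")

# Drop the NaN cells per row when converting back to a list of dicts.
outer_records = [
    {key: value for key, value in row.items() if pd.notna(value)}
    for row in outer.to_dict(orient="records")
]

print(inner.to_dict(orient="records"))
print(outer_records)
```

Whether unmatched rows should be kept or dropped depends on the data, so this is a choice to make rather than a fix.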
Then convert back to a list of dicts:
df_out.to_dict(orient='records')
[{'index': 217, 'name': 'Battery', 'desc': 'Text info 2'},
{'index': 218, 'name': 'Fluffy', 'desc': 'will be very informative'},
{'index': 219, 'name': 'Dazzling', 'desc': 'dont know what else i could write here'},
{'index': 220, 'name': 'Soul-Heart', 'desc': 'Boosts my wallet'}]
- Docs: pandas.DataFrame.to_dict()
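To merge two Python dictionaries that share keys, you can call the update() method on one of them. It copies every key from the second dictionary into the first, overwriting the values of any keys they have in common.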
dict1.update(dict2)
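This should give you the expected result, but note that if the two dictionaries hold different values for a shared key, the value from the second dictionary wins and the first dictionary's value is overwritten.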
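Since update() works on a single pair of dictionaries, the two lists still have to be paired up by index first. Here is a rough sketch of one way to do that; the names `desc_by_index` and `merged` are illustrative, and it assumes each index appears at most once per list.

```python
names = [
    {"index": 217, "name": "Battery"},
    {"index": 218, "name": "Fluffy"},
]
descs = [
    {"index": 218, "desc": "will be very informative"},
    {"index": 217, "desc": "Text info 2"},
]

# Build a lookup from index to the matching description dict.
desc_by_index = {d["index"]: d for d in descs}

merged = []
for entry in names:
    combined = dict(entry)  # copy, so the original dict is not mutated
    # An index with no description simply contributes nothing.
    combined.update(desc_by_index.get(entry["index"], {}))
    merged.append(combined)

print(merged)
```

Copying before calling update() leaves the input lists untouched, which matters if they are reused later.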
I'm assuming the values of the index key are unique within each list:
lst1 = [
{"index": 217, "name": "Battery"},
{"index": 218, "name": "Fluffy"},
{"index": 219, "name": "Dazzling"},
{"index": 220, "name": "Soul-Heart"},
]
lst2 = [
{"index": 217, "desc": "Text info 2"},
{"index": 218, "desc": "will be very informative"},
{"index": 219, "desc": "dont know what else i could write here"},
{"index": 220, "desc": "Boosts my wallet"},
]
tmp1 = {d["index"]: d["name"] for d in lst1}
tmp2 = {d["index"]: d["desc"] for d in lst2}
out = []
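# Only indexes present in both lists are merged; sort them because set order is not guaranteed.
for k in sorted(tmp1.keys() & tmp2.keys()):
    out.append(
        {"index": k, "name": tmp1.get(k, "N/A"), "desc": tmp2.get(k, "N/A")}
    )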
print(out)
Prints:
[
{"index": 217, "name": "Battery", "desc": "Text info 2"},
{"index": 218, "name": "Fluffy", "desc": "will be very informative"},
{
"index": 219,
"name": "Dazzling",
"desc": "dont know what else i could write here",
},
{"index": 220, "name": "Soul-Heart", "desc": "Boosts my wallet"},
]
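This will work, but it is inefficient, because for each dict in one list you have to iterate over every dict in the other. It would be better if you could change the data structure so that you don't have to loop over dicts in a list. See whether you can change it so that you have…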
dicts2 = {
217 : {'desc': 'Text info 2'},
218 : {'desc': 'will be very informative'},
219 : {'desc': 'dont know what else i could write here'},
220 : {'desc': 'Boosts my wallet'}
}
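That way you can take advantage of what a dict is good at and look up the item you need directly, instead of iterating over every item (as you have to with a list).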
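As a sketch of what that restructuring buys you, assuming both datasets can be keyed by index (the names `names_by_index`, `descs_by_index`, and `records` are made up for illustration), the merge becomes one lookup per index:

```python
names_by_index = {
    217: {"name": "Battery"},
    218: {"name": "Fluffy"},
}
descs_by_index = {
    217: {"desc": "Text info 2"},
    218: {"desc": "will be very informative"},
}

# Combine the fields for every index found in either dict.
merged = {
    idx: {**names_by_index.get(idx, {}), **descs_by_index.get(idx, {})}
    for idx in sorted(names_by_index.keys() | descs_by_index.keys())
}

# Convert back to the list-of-dicts shape from the question if needed.
records = [{"index": idx, **fields} for idx, fields in merged.items()]
print(records)
```

Each lookup is a hash access, so the work grows with the number of indexes rather than with the product of the two list lengths.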
But here is a solution for the data as you have it now:
dicts1 = [
{'index': 217, 'name': 'Battery'},
{'index': 218, 'name': 'Fluffy'},
{'index': 219, 'name': 'Dazzling'},
{'index': 220, 'name': 'Soul-Heart'}
]
dicts2 = [
{'index': 217, 'desc': 'Text info 2'},
{'index': 218, 'desc': 'will be very informative'},
{'index': 219, 'desc': 'dont know what else i could write here'},
{'index': 220, 'desc': 'Boosts my wallet'}
]
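# Compare every dict in dicts1 with every dict in dicts2.
for d1 in dicts1:
    for d2 in dicts2:
        if d1['index'] == d2['index']:
            # Copy all fields of the matching d1 into d2 (modifies dicts2 in place).
            for key, value in d1.items():
                d2[key] = value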
print(dicts2)
--------------------------------------------------------------
[
{'index': 217, 'desc': 'Text info 2', 'name': 'Battery'},
{'index': 218, 'desc': 'will be very informative', 'name': 'Fluffy'},
{'index': 219, 'desc': 'dont know what else i could write here', 'name': 'Dazzling'},
{'index': 220, 'desc': 'Boosts my wallet', 'name': 'Soul-Heart'}
]
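To merge the lists, use the following. Note that zip() pairs items by position, so this only works if both lists have the same length and are in the same index order: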
a = [{'index': 217, 'name': 'Battery'},{'index': 218, 'name': 'Fluffy'},{'index': 219, 'name': 'Dazzling'}, {'index': 220, 'name': 'Soul-Heart'}]
b = [{'index': 217, 'desc': 'Text info 2'},{'index': 218, 'desc': 'will be very informative'},{'index': 219, 'desc': 'dont know what else i could write here'},{'index': 220, 'desc': 'Boosts my wallet'}]
final_lst = []
for first, second in zip(a, b):
    # update() modifies the dicts in list a in place.
    first.update(second)
    final_lst.append(first)
print(final_lst)
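If the lists might not be in the same order, grouping by index avoids relying on position. The following is a minimal sketch using collections.defaultdict; the helper name `merge_by_key` is hypothetical, and it assumes every dict contains the key.

```python
from collections import defaultdict


def merge_by_key(*lists, key="index"):
    """Merge dicts from any number of lists that share the same key value."""
    merged = defaultdict(dict)
    for lst in lists:
        for item in lst:
            # Later lists overwrite earlier ones if field names collide.
            merged[item[key]].update(item)
    # Dicts keep insertion order, so results follow first appearance.
    return list(merged.values())


a = [{"index": 217, "name": "Battery"}, {"index": 218, "name": "Fluffy"}]
b = [
    {"index": 218, "desc": "will be very informative"},
    {"index": 217, "desc": "Text info 2"},
]

result = merge_by_key(a, b)
print(result)
```

Unlike the zip() version, this builds new dicts and leaves the input lists unchanged.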