I have two large dictionaries. Both are keyed by image name but hold different values.
The first dictionary, named train_descriptions, looks like this:
{'15970.jpg': 'Turtle Check Men Navy Blue Shirt',
'39386.jpg': 'Peter England Men Party Blue Jeans',
'59263.jpg': 'Titan Women Silver Watch',
....
....
'1855.jpg': 'Inkfruit Mens Chain Reaction T-shirt'}
And the second dictionary, named train_features:
{'31973.jpg': array([[0.00125694, 0. , 0.03409385, ..., 0.00434341, 0.00728011,
0.01451511]], dtype=float32),
'30778.jpg': array([[0.0174035 , 0.04345186, 0.00772929, ..., 0.02230316, 0. ,
0.03104496]], dtype=float32),
...,
...,
'38246.jpg': array([[0.00403965, 0.03701203, 0.02616892, ..., 0.02296285, 0.00930257,
0.04575242]], dtype=float32)}
The lengths of the two dictionaries are:
len(train_descriptions) = 44424
len(train_features) = 44441
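As you can see, train_descriptions is shorter than train_features: the train_features dictionary has more keys than train_descriptions. How can I remove from train_features the keys that are not in train_descriptions, so that both dictionaries have the same length?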
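Use xor (^) to get the keys on which the two dictionaries differ:
diff = train_features.keys() ^ train_descriptions.keys()
for k in diff:
    # ^ also returns keys found only in train_descriptions,
    # so pop with a default instead of del to avoid a KeyError
    train_features.pop(k, None)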
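One thing worth keeping in mind with ^: it is a symmetric difference, so it returns keys that appear in exactly one of the two dictionaries, including keys that exist only in train_descriptions. The sketch below uses small made-up dictionaries (the file names and values are hypothetical stand-ins for the real data) to show why popping with a default is the safer choice here.

```python
# Toy stand-ins for the real dictionaries; names and values are made up.
train_descriptions = {"a.jpg": "shirt", "b.jpg": "jeans", "d.jpg": "watch"}
train_features = {"a.jpg": [0.1], "b.jpg": [0.2], "c.jpg": [0.3]}

# Symmetric difference: keys that appear in exactly one of the two dicts.
diff = train_features.keys() ^ train_descriptions.keys()

for k in diff:
    # "d.jpg" lives only in train_descriptions, so `del train_features[k]`
    # would raise KeyError for it; pop with a default silently skips it.
    train_features.pop(k, None)

print(sorted(train_features))  # ['a.jpg', 'b.jpg']
```

Note that after this, train_descriptions may still hold keys that train_features lacks ("d.jpg" in the toy data), so the lengths only match if train_descriptions is a subset of train_features to begin with.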
Use a for loop:
feat = train_features.keys()
desc = train_descriptions.keys()
# keys present in train_features but missing from train_descriptions
extra = [i for i in feat if i not in desc]
for i in extra:
    del train_features[i]
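Edit: see below.
The code above works, but it can be done more efficiently without converting the dict_keys to a list, as follows:
for i in train_features.keys() - train_descriptions.keys():
    del train_features[i]
Subtracting one dict_keys view from another returns a set containing the keys found only in the first. The first version builds a list before deleting, which is neither efficient nor necessary.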
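To make the behaviour of the subtraction concrete, here is a minimal sketch on toy data (the keys and values are hypothetical). It shows that the result of the subtraction is an ordinary set, independent of the dictionary, which is why deleting inside the loop is safe.

```python
# Toy stand-ins for the real dictionaries; names and values are made up.
train_descriptions = {"a.jpg": "shirt", "b.jpg": "jeans"}
train_features = {"a.jpg": [0.1], "b.jpg": [0.2], "c.jpg": [0.3]}

# Key views support set operations; `-` yields a new set, not a view.
extra = train_features.keys() - train_descriptions.keys()
print(type(extra).__name__, extra)  # set {'c.jpg'}

# `extra` is a separate object, so deleting from the dict while looping is fine.
for k in extra:
    del train_features[k]

print(len(train_features) == len(train_descriptions))  # True
```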
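Just pop() the key if it does not exist in train_descriptions:
# iterate over a snapshot of the keys, since the dict shrinks inside the loop
for key in list(train_features):
    if key not in train_descriptions:
        train_features.pop(key)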
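If mutating the dictionary inside a loop feels fragile, one alternative (not taken from the answers above, just a common idiom) is to build a filtered copy with a dict comprehension. The sketch below again uses hypothetical toy data.

```python
# Toy stand-ins for the real dictionaries; names and values are made up.
train_descriptions = {"a.jpg": "shirt", "b.jpg": "jeans"}
train_features = {"a.jpg": [0.1], "b.jpg": [0.2], "c.jpg": [0.3]}

# Keep only the entries whose key also appears in train_descriptions.
# This rebinds the name to a new dict instead of modifying the old one in place.
train_features = {
    key: value
    for key, value in train_features.items()
    if key in train_descriptions
}

print(list(train_features))  # ['a.jpg', 'b.jpg']
```

The trade-off is a temporary second dictionary in memory. The feature arrays themselves are not copied, only the references to them.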