I have a nested Python dictionary like this:
{'dist_river':
    {'high':
        {'wind_speed':
            {'1':
                {'population':
                    {'high':
                        {'school':
                            {'high': 'T', 'medium': 'T', 'low': 'F'}
                        },
                     'medium':
                        {'land_cover':
                            {'Mix_garden':
                                {'income_source':
                                    {'Plantation': 'T', 'Agriculture': 'F'}
                                }
                            }
                        }
                    }
                }
            }
        },
     'low': 'F'
    }
}
How can I get the sub-dictionaries out of the nested dictionary? For example, the sub-dictionaries from dic:
results = [
    {'dist_river':
        {'high':
            {'wind_speed':
                {'1':
                    {'population':
                        {'high':
                            {'school':
                                {'high': 'T', 'medium': 'T', 'low': 'F'}
                            }}}}}}},
    {'dist_river':
        {'high':
            {'wind_speed':
                {'1':
                    {'population':
                        {'medium':
                            {'land_cover':
                                {'Mix_garden':
                                    {'income_source':
                                        {'Plantation': 'T', 'Agriculture': 'F'}
                                    }}}}}}}}},
    {'dist_river':
        {'low': 'F'}
    }
]
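len(results) == 3
Thanks for your help.
Community edit: It seems that each resulting dictionary may have only one entry at each nesting level. In other words, each result contains the entire path to one leaf of the dictionary tree. (Tim Pietzcker, 13 hours ago)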
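import collections.abc

def isDict(d):
    # collections.Mapping was removed in Python 3.10; the ABC lives in collections.abc
    return isinstance(d, collections.abc.Mapping)

def isAtomOrFlat(d):
    # True for a non-dict value, or for a dict none of whose values are dicts
    return not isDict(d) or not any(isDict(v) for v in d.values())

def leafPaths(nestedDicts, noDeeper=isAtomOrFlat):
    """
    For each leaf in NESTEDDICTS, this yields a
    dictionary consisting of only the entries between the root
    and the leaf.
    """
    for key, value in nestedDicts.items():
        if noDeeper(value):
            yield {key: value}
        else:
            # pass noDeeper along so a custom predicate applies at every level
            for subpath in leafPaths(value, noDeeper):
                yield {key: subpath}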
Demo:
>>> pprint.pprint(list( leafPaths(dic) ))
[{'dist_river': {'high': {'wind_speed': {'1': {'population': {'high': {'school': {'high': 'T',
'low': 'F',
'medium': 'T'}}}}}}}},
{'dist_river': {'high': {'wind_speed': {'1': {'population': {'medium': {'land_cover': {'Mix_garden': {'income_source': {'Agriculture': 'F',
'Plantation': 'T'}}}}}}}}}},
{'dist_river': {'low': 'F'}}]
Side note 1: Unless you need this format for some reason, I personally think it is nicer to yield the nodes as tuples, for example:
...noDeeper=lambda x:not isDict(x)...
...yield tuple(value)
...yield (key,)+subpath
[('dist_river', 'high', 'wind_speed', '1', 'population', 'high', 'school', 'high', 'T'),
('dist_river', 'high', 'wind_speed', '1', 'population', 'high', 'school', 'medium', 'T'),
('dist_river', 'high', 'wind_speed', '1', 'population', 'high', 'school', 'low', 'F'),
('dist_river', 'high', 'wind_speed', '1', 'population', 'medium', 'land_cover', 'Mix_garden', 'income_source', 'Plantation', 'T'),
('dist_river', 'high', 'wind_speed', '1', 'population', 'medium', 'land_cover', 'Mix_garden', 'income_source', 'Agriculture', 'F'),
('dist_river', 'low', 'F')]
(This is easy to extract from the "straightforward" answer, which happens to be thg435's answer.)
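The fragments in side note 1 are elided, so here is one way they might fit together into a complete function. This is a minimal sketch, not the answerer's exact code: the name `leaf_path_tuples` and the small `sample` dictionary are made up for illustration, and the leaf case yields `(key, value)` so that a multi-character string value stays a single element.

```python
from collections.abc import Mapping

def leaf_path_tuples(nested):
    """Yield one tuple (key1, key2, ..., leaf_value) per leaf (hypothetical helper)."""
    for key, value in nested.items():
        if isinstance(value, Mapping):
            # Prefix this level's key to every path found further down.
            for subpath in leaf_path_tuples(value):
                yield (key,) + subpath
        else:
            # A non-dict value ends the path.
            yield (key, value)

# Small illustrative input, not the dictionary from the question.
sample = {'dist_river': {'high': {'school': {'high': 'T', 'low': 'F'}},
                         'low': 'F'}}
paths = list(leaf_path_tuples(sample))
for path in paths:
    print(path)
```

A flat tuple per leaf is easy to filter, sort, or turn back into nested dictionaries later if that format is needed after all.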
Side note 2: Note that the OP is not looking for the naive implementation. The naive implementation would use noDeeper=lambda x: not isDict(x), which results in:
>>> pprint.pprint(list( leafPaths(dic) ))
[{'dist_river': {'high': {'wind_speed': {'1': {'population': {'high': {'school': {'high': 'T'}}}}}}}},
{'dist_river': {'high': {'wind_speed': {'1': {'population': {'high': {'school': {'medium': 'T'}}}}}}}},
{'dist_river': {'high': {'wind_speed': {'1': {'population': {'high': {'school': {'low': 'F'}}}}}}}},
{'dist_river': {'high': {'wind_speed': {'1': {'population': {'medium': {'land_cover': {'Mix_garden': {'income_source': {'Plantation': 'T'}}}}}}}}}},
{'dist_river': {'high': {'wind_speed': {'1': {'population': {'medium': {'land_cover': {'Mix_garden': {'income_source': {'Agriculture': 'F'}}}}}}}}}},
{'dist_river': {'low': 'F'}}]
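Edit: this is an inefficient algorithm. Each leaf L is re-yielded once at every level of recursion above it, so the work per leaf grows with its depth. A more efficient approach would be to chain the generators with a custom data structure, or to simulate the stack manually.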
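As a rough illustration of the "simulate the stack manually" idea, the sketch below walks the tree with an explicit stack, so each result is built and yielded once instead of being passed up through a chain of nested generators. The names `is_atom_or_flat`, `leaf_paths_iterative` and `sample` are assumptions made for this example. It is meant to mirror the behaviour of leafPaths with its default predicate, not to be a tuned implementation.

```python
from collections.abc import Mapping

def is_atom_or_flat(d):
    # Non-dict value, or a dict that contains no nested dicts.
    return not isinstance(d, Mapping) or not any(
        isinstance(v, Mapping) for v in d.values())

def leaf_paths_iterative(nested):
    """Yield one single-path dict per leaf, using an explicit stack."""
    # Each stack entry is (keys from the root to this node, node).
    # Children are pushed in reverse so they are popped in original order.
    stack = [((key,), value) for key, value in reversed(list(nested.items()))]
    while stack:
        keys, node = stack.pop()
        if is_atom_or_flat(node):
            # Wrap the leaf in one single-key dict per ancestor key.
            result = node
            for key in reversed(keys):
                result = {key: result}
            yield result
        else:
            for key, value in reversed(list(node.items())):
                stack.append((keys + (key,), value))

# Small illustrative input, not the dictionary from the question.
sample = {'a': {'x': {'p': {'h': 'T', 'l': 'F'},
                      'q': {'m': {'n': 'T'}}},
                'y': 'F'}}
results = list(leaf_paths_iterative(sample))
```

Building the nested result still costs time proportional to the depth of the leaf, but it avoids recursion limits on very deep dictionaries.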
Maybe this:
def enum_paths(p):
    # anything without .items() is treated as a leaf value
    if not hasattr(p, 'items'):
        yield p
    else:
        for k, v in p.items():
            for x in enum_paths(v):
                yield {k: x}

for x in enum_paths(dic):
    print(x)
You "get" it exactly the same way you would get any other value from a dictionary.
print(dic1['dist_river']['high'])
and so on.
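When the chain of keys is only known at run time, the same repeated indexing can be written as a fold over the key list. This is a small sketch using the standard `functools.reduce` and `operator.getitem`. The helper name `get_by_path` and the `sample` dictionary are hypothetical.

```python
from functools import reduce
import operator

def get_by_path(nested, keys):
    """Return nested[k1][k2]...[kn] for keys = [k1, k2, ..., kn]."""
    # reduce applies operator.getitem step by step, starting from `nested`.
    return reduce(operator.getitem, keys, nested)

# Small illustrative input, not the dictionary from the question.
sample = {'dist_river': {'high': {'wind_speed': {'1': 'T'}},
                         'low': 'F'}}

sub = get_by_path(sample, ['dist_river', 'high'])
leaf = get_by_path(sample, ['dist_river', 'low'])
print(sub)
print(leaf)
```

A missing key raises `KeyError`, just as plain chained indexing would.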
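Edit:
If I misunderstood the question and it is actually about getting a list of all the dicts at once, here is an example that assumes each dict has only one key: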
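def get_nested_dicts(d):
    dicts = []
    probe = d
    # follow the single value at each level until a non-dict is reached
    while isinstance(probe, dict):
        dicts.append(probe)
        # dict.values() is a view in Python 3 and cannot be indexed
        probe = next(iter(probe.values()))
    return dicts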