Traversing/iterating over a nested dictionary of arbitrary depth (the dictionary represents a directory tree)


Python newbie at the time of writing.

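This came up because I want the user to be able to select a group of files from a directory (and any of its subdirectories). Unfortunately, Tkinter's default support for selecting multiple files in a file dialog is broken on Windows 7 (http://bugs.python.org/issue8010).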

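So I am trying to represent the directory structure another way (still using Tkinter): by building a facsimile of the directory structure out of labeled, indented checkboxes organized in a tree. A directory like this: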

SomeRootDirectory
    foo.txt
    bar.txt
    Stories
        Horror
            scary.txt
            Trash
                notscary.txt
        Cyberpunk
    Poems
        doyoureadme.txt

would look like this (where # represents a checkbutton):

SomeRootDirectory
    # foo.txt
    # bar.txt
    Stories
        Horror
            # scary.txt
            Trash
                # notscary.txt
        Cyberpunk
    Poems
        # doyoureadme.txt

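Building the initial dictionary from the directory structure is easy using a recipe I found on ActiveState (see below), but I hit a wall when I try to iterate over the nicely nested dictionary I am left with.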
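The ActiveState recipe itself is not reproduced in this thread. As a rough stand-in, here is a minimal sketch of how such a nested dictionary could be built with the standard os module. The name build_tree is hypothetical, and mapping files to None and empty directories to {} is an assumption chosen to match the dictionaries used in the answers below.

```python
import os

def build_tree(root):
    """Return a nested dict describing the directory `root`.

    Sub-directories map to nested dicts, files map to None.
    (Hypothetical helper, not the ActiveState recipe.)
    """
    tree = {}
    for entry in sorted(os.listdir(root)):
        full_path = os.path.join(root, entry)
        if os.path.isdir(full_path):
            # recurse into the sub-directory
            tree[entry] = build_tree(full_path)
        else:
            # leaf: a plain file
            tree[entry] = None
    return tree
```

Calling it as `{os.path.basename(root): build_tree(root)}` would give a dictionary with a single top-level key for the root directory, like the examples in the answers.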

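Here is a function that prints all of your file names. It goes through every key in the dictionary, and if a key maps to something that is not a dictionary (in your case, a file name), it prints the name. Otherwise, it calls itself on the dictionary that the key maps to.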

def print_all_files(directory):
    for filename in directory.keys():
        if not isinstance(directory[filename], dict):
            # not a dict, so this key is a file: print its name
            print(filename)
        else:
            # a dict is a sub-directory: recurse into it
            print_all_files(directory[filename])

You can modify this code to do whatever you want; it is only an example of how recursion lets you avoid hard-coding a fixed depth.

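The key thing to understand is that each time print_all_files is called, it has no knowledge of how deep it is in the tree. It just looks at the files that are right there and prints their names. If there are directories, it runs itself on them.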
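If the depth is needed after all, for example to indent the entries the way the question describes, one option is to pass it down as an extra argument. The sketch below is only an illustration: render_tree and its parameters are hypothetical names, '#' stands in for a checkbutton as in the question, and the empty Cyberpunk directory is represented as an empty dict by assumption.

```python
def render_tree(directory, depth=0, indent="    "):
    """Return display lines for a nested dict; files are marked with '#'."""
    lines = []
    for name, value in directory.items():
        if isinstance(value, dict):
            # directory: print the bare name, then its contents one level deeper
            lines.append(indent * depth + name)
            lines.extend(render_tree(value, depth + 1, indent))
        else:
            # file: prefix with the checkbutton placeholder
            lines.append(indent * depth + "# " + name)
    return lines

tree = {
    "SomeRootDirectory": {
        "foo.txt": None,
        "bar.txt": None,
        "Stories": {
            "Horror": {"scary.txt": None, "Trash": {"notscary.txt": None}},
            "Cyberpunk": {},
        },
        "Poems": {"doyoureadme.txt": None},
    }
}
print("\n".join(render_tree(tree)))
```

In a Tkinter version, the two append calls are where a label or a checkbutton would presumably be created, with depth deciding the left padding.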

Here is some preliminary code. Go through it and tell me where you run into problems.

# maps an index to the name of the directory registered under it
Parents={-1:"Root"}
def add_dir(level, parent, index, k):
    print("Directory")
    print("Level=%d, Parent=%s, Index=%d, value=%s" % (level, Parents[parent], index, k))
def add_file(parent, index, k):
    print("File")
    print("Parent=%s, Index=%d, value=%s" %  (Parents[parent], index, k))
def f(level=0, parent=-1, index=0, di={}):
    for k in di:
        # index is local to each call, so numbers are reused across branches
        index +=1
        # a non-empty dict is a directory; None (or an empty dict) is treated as a file
        if di[k]:
            Parents[index]=k
            add_dir(level, parent, index, k)
            f(level+1, index, index, di[k])
        else:
            add_file(parent, index, k)
a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}
f(di=a)

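I realize this is an old question, but I was just looking for a simple, clean way to walk nested dicts, and this is the closest thing my limited searching turned up. oadams' answer isn't useful enough if you want more than just file names, and spicavigo's answer looks complicated.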

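I ended up rolling my own function that behaves much like os.walk does for directories, except that it returns all of the key/value information.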

It returns an iterator and, for each "directory" in the tree of nested dicts, the iterator yields (path, sub-dicts, values), where:

  • path is the path to the dict
  • sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
  • values is a tuple of (key,value) pairs for each (non-dict) item in this dict

def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res

Here is the code I used to test it, though it has a couple of other unrelated (but neat) things in it:

# the standard-library json module provides the same dumps/loads calls as simplejson
import json
from collections import defaultdict
# see https://gist.github.com/2012250
tree = lambda: defaultdict(tree)
def walk(d):
    '''
    Walk a tree (nested dicts).
    
    For each 'path', or dict, in the tree, returns a 3-tuple containing:
    (path, sub-dicts, values)
    
    where:
    * path is the path to the dict
    * sub-dicts is a tuple of (key,dict) pairs for each sub-dict in this dict
    * values is a tuple of (key,value) pairs for each (non-dict) item in this dict
    '''
    # nested dict keys
    nested_keys = tuple(k for k in d.keys() if isinstance(d[k],dict))
    # key/value pairs for non-dicts
    items = tuple((k,d[k]) for k in d.keys() if k not in nested_keys)
    
    # return path, key/sub-dict pairs, and key/value pairs
    yield ('/', [(k,d[k]) for k in nested_keys], items)
    
    # recurse each subdict
    for k in nested_keys:
        for res in walk(d[k]):
            # for each result, stick key in path and pass on
            res = ('/%s' % k + res[0], res[1], res[2])
            yield res
# use fancy tree to store arbitrary nested paths/values
mem = tree()
root = mem['SomeRootDirectory']
root['foo.txt'] = None
root['bar.txt'] = None
root['Stories']['Horror']['scary.txt'] = None
root['Stories']['Horror']['Trash']['notscary.txt'] = None
root['Stories']['Cyberpunk']
root['Poems']['doyoureadme.txt'] = None
# convert to json string
s = json.dumps(mem, indent=2)
#print(mem)
print(s)
print('')
# json.loads converts to nested dicts, need to walk them
for (path, dicts, items) in walk(json.loads(s)):
    # this will print every path
    print('[%s]' % path)
    for key,val in items:
        # this will print every key,value pair (skips empty paths)
        print('%s = %s' % (path+key,val))
    print('')

The output looks like this:

{
  "SomeRootDirectory": {
    "foo.txt": null,
    "Stories": {
      "Horror": {
        "scary.txt": null,
        "Trash": {
          "notscary.txt": null
        }
      },
      "Cyberpunk": {}
    },
    "Poems": {
      "doyoureadme.txt": null
    },
    "bar.txt": null
  }
}
[/]
[/SomeRootDirectory/]
/SomeRootDirectory/foo.txt = None
/SomeRootDirectory/bar.txt = None
[/SomeRootDirectory/Stories/]
[/SomeRootDirectory/Stories/Horror/]
/SomeRootDirectory/Stories/Horror/scary.txt = None
[/SomeRootDirectory/Stories/Horror/Trash/]
/SomeRootDirectory/Stories/Horror/Trash/notscary.txt = None
[/SomeRootDirectory/Stories/Cyberpunk/]
[/SomeRootDirectory/Poems/]
/SomeRootDirectory/Poems/doyoureadme.txt = None

You can walk a nested dictionary using recursion:

def walk_dict(dictionary):
    for key in dictionary:
        if isinstance(dictionary[key], dict):
            walk_dict(dictionary[key])
        else:
            # do something with dictionary[key]
            pass
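As one possible way to fill in the "do something" step, the same recursion can be written as a generator that hands each leaf back to the caller together with the keys leading to it. This is only a sketch: iter_leaves and the sample dictionary are made-up names for illustration, and `yield from` requires Python 3.

```python
def iter_leaves(dictionary, path=()):
    """Yield (path, value) for every non-dict value in a nested dict."""
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # delegate to the recursive generator for the sub-dict
            yield from iter_leaves(value, path + (key,))
        else:
            # leaf: report the full key path and the stored value
            yield path + (key,), value

nested = {"Stories": {"Horror": {"scary.txt": None}}, "foo.txt": None}
for path, value in iter_leaves(nested):
    print("/".join(path), value)
```

The caller then decides what to do with each leaf, so the traversal does not need to be rewritten for every new use.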

Hope that helps.

a={
    'SomeRootDirectory': {
        'foo.txt': None,
        'bar.txt': None,
        'Stories': {
            'Horror': {
                'scary.txt' : None,
                'Trash' : {
                    'notscary.txt' : None,
                    },
                },
            'Cyberpunk' : None
            },
        'Poems' : {
            'doyoureadme.txt' : None
        }
    }
}
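def dict_paths(dictionary, level=0, parents=None, paths=None):
  # None defaults: mutable default lists would be shared between separate calls
  parents = [] if parents is None else parents
  paths = [] if paths is None else paths
  for key in dictionary:
    # copy the parent chain, cut back to the current depth
    parents = parents[0:level]
    paths.append(parents + [key])
    if dictionary[key]:
      parents.append(key)
      dict_paths(dictionary[key], level+1, parents, paths)
  return paths
dp = dict_paths(a)
for p in dp:
    print('/'.join(p))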
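All of the recursive approaches in this thread are bounded by Python's recursion limit (see sys.getrecursionlimit(), commonly 1000 by default). Real directory trees rarely nest that deeply, but for truly arbitrary depth an explicit stack avoids the limit. The following is a sketch of that alternative, not something taken from the answers above; iter_paths is a hypothetical name.

```python
def iter_paths(dictionary):
    """Yield every key path in a nested dict, parents before children."""
    # Each stack entry is (full key path, value stored at that path).
    # Entries are pushed in reverse so they pop in their original order.
    stack = [((key,), value) for key, value in reversed(list(dictionary.items()))]
    while stack:
        path, value = stack.pop()
        yield path
        if isinstance(value, dict):
            for key, child in reversed(list(value.items())):
                stack.append((path + (key,), child))

sample = {"root": {"a.txt": None, "sub": {"b.txt": None}, "c.txt": None}}
for p in iter_paths(sample):
    print("/".join(p))
```

The paths come out in the same parent-first order as a recursive walk, so the two styles should be interchangeable for building a display.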
