How to use a generator in a find-like wrapper function built on os



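I have a Python function that works like the find command. It descends until it reaches m_depth (the maximum depth), and it does not enter any directory listed in ignore_dirs. It returns a list of the files found during the walk. The code is quite simple and uses recursion.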

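However, with a large number of files or a greater depth, the recursion takes a long time, and the list keeps growing as each call returns. So I am wondering whether a generator could be used, so that at least the memory consumed per iteration is lower?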

I tried yielding the results, but it quits whenever it encounters one of the ignore_dirs.

This is the code I have:

import os

def find(source_d, m_depth, ignore_dirs):
    '''
    This method does a recursive listing of files/directories from a given
    path up to the maximum recursion depth provided as m_depth.
    :param source_d: Given source path to start the recursion from
    :param m_depth: Maximum recursion depth [determines how deep the method will traverse through the file system]
    :param ignore_dirs: these paths will not be traversed. List of strings.
    '''
    def helper_find(path, ignore_dirs, m_depth, curr_depth=1):
        files = []
        if any(ignore_sub_dir == os.path.split(path)[-1] for ignore_sub_dir in ignore_dirs):
            return []
        if m_depth < curr_depth:
            return []
        else:
            things = os.listdir(path)
            for thing in things:
                if(os.path.isdir(os.path.join(path, thing))):
                    files.extend(helper_find(os.path.join(path, thing), ignore_dirs, m_depth, curr_depth+1))
                else:
                    files.append(os.path.join(path, thing))
        return files
    return helper_find(source_d, ignore_dirs, m_depth)

Yes, you can: use yield from (available only in Python 3, from version 3.3 onward):

import os

def find(source_d, m_depth, ignore_dirs):
    '''
    This method does a recursive listing of files/directories from a given
    path up to the maximum recursion depth provided as m_depth.
    :param source_d: Given source path to start the recursion from
    :param m_depth: Maximum recursion depth [determines how deep the method will traverse through the file system]
    :param ignore_dirs: these paths will not be traversed. List of strings.
    '''
    def helper_find(path, ignore_dirs, m_depth, curr_depth=1):
        # Descend only if this directory is not ignored and the depth limit allows it.
        ignored = any(ignore_sub_dir == os.path.split(path)[-1] for ignore_sub_dir in ignore_dirs)
        if not ignored and m_depth >= curr_depth:
            things = os.listdir(path)
            for thing in things:
                if os.path.isdir(os.path.join(path, thing)):
                    # Delegate to the sub-generator for the subdirectory.
                    yield from helper_find(os.path.join(path, thing), ignore_dirs, m_depth, curr_depth+1)
                else:
                    yield os.path.join(path, thing)
    return helper_find(source_d, ignore_dirs, m_depth)
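
To see the lazy behaviour in practice, here is a minimal, self-contained sketch. It assumes a throwaway directory tree created with tempfile; the names find_lazy, build_demo_tree and the file layout are made up for illustration and are not part of the question. The directory listing is sorted only so the demo output is deterministic.

```python
import itertools
import os
import tempfile


def find_lazy(source_d, m_depth, ignore_dirs):
    """Yield file paths under source_d, at most m_depth directory levels deep."""
    def helper(path, curr_depth):
        # Skip ignored directory names and anything past the depth limit.
        if os.path.basename(path) in ignore_dirs or curr_depth > m_depth:
            return
        # sorted() is only here to make the demo order predictable.
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if os.path.isdir(full):
                yield from helper(full, curr_depth + 1)
            else:
                yield full
    return helper(source_d, 1)


def build_demo_tree():
    """Create a small throwaway tree (hypothetical layout) and return its root."""
    root = tempfile.mkdtemp()
    for rel in ("a.txt", "keep/b.txt", "skip/c.txt", "keep/deep/d.txt"):
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    return root


demo_root = build_demo_tree()

# Full traversal, depth 2, ignoring any directory named "skip".
found = sorted(os.path.relpath(p, demo_root)
               for p in find_lazy(demo_root, 2, ["skip"]))

# Take just the first result; the rest of the tree is never listed.
first_only = list(itertools.islice(find_lazy(demo_root, 3, ["skip"]), 1))
```

Because the generator does no work until it is iterated, a caller can stop early (as with islice above) without paying for the rest of the walk. An ignored directory simply yields nothing, and iteration continues with its siblings.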
