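I'm learning Python and writing an application that recurses through a folder tree and identifies files with a specific extension.
The test folder structure is as follows, and it contains 10 text files: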
C:\TEMP\ROOT
├───dir1
│   │   dir1file1.txt
│   │   dir1file2.txt
│   │
│   ├───subdir1
│   │       dir1subdir1file1.txt
│   │       dir1subdir1file2.txt
│   │
│   └───subdir2
│           dir1subdir2file1.txt
│           dir1subdir2file2.txt
│
└───dir2
    │   dir2file1.txt
    │   dir2file2.txt
    │
    └───subdir1
        │   dir2subdir1file1.txt
        │
        └───subdir1
            └───subdir1
                    dir2subdir1subdir1subdir1file1.txt
The business end of the code, extracted and simplified, is:

    import os

    def scan_for_txt_files(start_from):
        for root_path, subdirs, files in os.walk(start_from):
            for _ in subdirs:
                # In the real application I update a progress bar here.
                for this_file in files:
                    ext = str.lower(os.path.splitext(this_file)[1]).replace('.', '')
                    if ext == 'txt':
                        print(f'{os.path.join(root_path, this_file)}')

When run, it prints:
c:\temp\root\dir1\dir1file1.txt
c:\temp\root\dir1\dir1file2.txt
c:\temp\root\dir1\dir1file1.txt
c:\temp\root\dir1\dir1file2.txt
c:\temp\root\dir2\dir2file1.txt
c:\temp\root\dir2\dir2file2.txt
c:\temp\root\dir2\subdir1\dir2subdir1file1.txt
However, if I modify the code to remove the reference to subdirs, it works correctly:

    import os

    def scan_for_txt_files(start_from):
        for root_path, subdirs, files in os.walk(start_from):
            for this_file in files:
                ext = str.lower(os.path.splitext(this_file)[1]).replace('.', '')
                if ext == 'txt':
                    print(f'{os.path.join(root_path, this_file)}')

Output:
c:\temp\root\dir1\dir1file1.txt
c:\temp\root\dir1\dir1file2.txt
c:\temp\root\dir1\subdir1\dir1subdir1file1.txt
c:\temp\root\dir1\subdir1\dir1subdir1file2.txt
c:\temp\root\dir1\subdir2\dir1subdir2file1.txt
c:\temp\root\dir1\subdir2\dir1subdir2file2.txt
c:\temp\root\dir2\dir2file1.txt
c:\temp\root\dir2\dir2file2.txt
c:\temp\root\dir2\subdir1\dir2subdir1file1.txt
c:\temp\root\dir2\subdir1\subdir1\subdir1\dir2subdir1subdir1subdir1file1.txt
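The first form of the code exists because my intention was to determine the number of subfolders beforehand and then, in the 'for _ in subdirs' section, update a progress bar according to the number of folders scanned.
This happens both on a real file system and under pytest/pyfakefs. I'm sure it's something simple, but I can't see what is going on.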
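The problem was my incomplete understanding of Python generators, and my indentation was also wrong. The 'for this_file in files' section should be indented at the same level as the 'for _ in subdirs' section. With the file loop nested inside the subdirs loop, each directory's files were printed once per subfolder, and directories with no subfolders printed nothing at all.
Thanks for the replies.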
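For anyone who lands here with the same symptom, this is roughly what the corrected indentation could look like. It is only a sketch: `count_dirs` and the `on_subdir` callback are hypothetical names standing in for the real progress-bar code, and the function returns the matches instead of printing them so it is easier to check.

```python
import os


def count_dirs(start_from):
    """Hypothetical helper: count every folder below start_from,
    e.g. to size a progress bar before the real scan."""
    return sum(len(subdirs) for _, subdirs, _ in os.walk(start_from))


def scan_for_txt_files(start_from, on_subdir=None):
    """Return the paths of all .txt files under start_from.

    on_subdir is a hypothetical callback standing in for the
    progress-bar update; it is called once per subfolder seen.
    """
    found = []
    for root_path, subdirs, files in os.walk(start_from):
        for subdir in subdirs:
            # Progress update only; no file handling in this loop.
            if on_subdir is not None:
                on_subdir(os.path.join(root_path, subdir))
        # Same indentation level as the subdirs loop, so every
        # directory's files are handled exactly once.
        for this_file in files:
            ext = os.path.splitext(this_file)[1].lower()
            if ext == '.txt':
                found.append(os.path.join(root_path, this_file))
    return found
```

Because the two inner loops are siblings, the callback fires as many times as `count_dirs` reports, and a directory with no subfolders still has its files collected.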