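I have a list of file names and want to detect whether those files exist anywhere in a directory tree. I am very close, but I am stuck on the last step (step 5).
Steps
- Read the file names from a provided text file
- Save the file names as a list
- Loop over the saved list of file names
- Walk the directory and its subdirectories to check whether each file exists
- Save the names of the files that were found in a second list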
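The provided text file contains a list such as:
- testfile1.txt
- testfile2.txt
- testfile3.txt
- testfile4.txt
- testfile5.txt
Only testfile1 through testfile4 actually exist in the (sub)directories.
The expected output is a list such as ['testfile1.txt', 'testfile2.txt', 'testfile3.txt', 'testfile4.txt'].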
    import os.path
    from os import path
    import sys

    file = sys.argv[1]
    #top_dir = sys.argv[2]
    cwd = os.getcwd()

    with open(file, "r") as f: #Step 1
        file_list = []
        for line in f:
            file_name = line.strip()
            file_list.append(file_name) #Step 2
        print(file_list)

    for file in file_list: #Step 3
        detected_files = []
        for dir, sub_dirs, files in os.walk(cwd): #Step 4
            if file in files:
                print(file)
                print("Files Found")
                detected_files.append(file) #Step 5
                print(detected_files)
What it prints:
Files Found
testfile1.txt
['testfile1.txt']
Files Found
testfile2.txt
['testfile2.txt']
Files Found
testfile3.txt
['testfile3.txt']
Files Found
testfile4.txt
['testfile4.txt']
Your current process looks like this:

    with open(file, "r") as f: #Step 1
        ...
    for file in file_list: #Step 3
        detected_files = []
        ...
        for dir, sub_dirs, files in os.walk(cwd): #Step 4
            ...
You can see that every iteration of for file in file_list: creates a new, empty detected_files list, which throws away everything saved so far.
detected_files should be created once, before the loop:

    detected_files = []
    with open(file, "r") as f: #Step 1
        ...
    for file in file_list: #Step 3
        ...
        for dir, sub_dirs, files in os.walk(cwd): #Step 4
            ...
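Filled in, the outline above might look like the following minimal sketch. The function name find_listed_files and its two parameters are my own, not from your script, and the break is an added assumption: it stops walking once a name has been found, so a name that exists in several subdirectories is only recorded once.

```python
import os


def find_listed_files(list_path, top_dir):
    """Return the names listed in list_path that exist somewhere under top_dir.

    Hypothetical helper for illustration; not part of the original script.
    """
    with open(list_path, "r") as f:  # Steps 1 and 2
        # Skip blank lines so an empty string is never searched for.
        file_list = [line.strip() for line in f if line.strip()]

    detected_files = []  # created once, before the loop
    for name in file_list:  # Step 3
        for _dir, _sub_dirs, files in os.walk(top_dir):  # Step 4
            if name in files:
                detected_files.append(name)  # Step 5
                break  # assumption: one hit per name is enough
    return detected_files
```

It could be called as find_listed_files(sys.argv[1], os.getcwd()) to match the original script. Note that this still walks the whole tree once per listed name, which gets slow for long lists.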
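I would use a set for the membership test and keep all found file names in a set as well (to avoid duplicates). That way the directory tree is walked only once instead of once per file name.

    detected_files = set()
    with open(file, "r") as f: #Step 1
        file_list = set(line.strip() for line in f)
    for dir, sub_dirs, files in os.walk(cwd): #Step 4
        found = file_list.intersection(files)
        detected_files.update(found)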
If you want, you can short-circuit as soon as all of the files have been found.

    for dir, sub_dirs, files in os.walk(cwd): #Step 4
        found = file_list.intersection(files)
        detected_files.update(found)
        if detected_files == file_list:
            break
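As a self-contained sketch of the set-based approach with the early exit, something like the following should work. The name find_listed_files_set and its parameters are hypothetical, and returning a sorted list is my assumption, made because you asked for a list and sets have no order; the result is alphabetical rather than in the order of the text file.

```python
import os


def find_listed_files_set(list_path, top_dir):
    """Return a sorted list of the names in list_path found under top_dir.

    Hypothetical helper for illustration; not part of the original script.
    """
    with open(list_path, "r") as f:  # Steps 1 and 2
        # Blank lines are skipped so the empty string is never "wanted".
        wanted = {line.strip() for line in f if line.strip()}

    detected = set()
    for _dir, _sub_dirs, files in os.walk(top_dir):  # single walk of the tree
        detected.update(wanted.intersection(files))
        if detected == wanted:
            break  # everything has been found, no need to keep walking
    return sorted(detected)
```

The early exit only triggers when every listed name exists; if some are missing, as with testfile5.txt in your example, the whole tree is still walked once.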