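Python - Remove all lines that start with a word/string from a list

I am trying to parse a huge 50K-line file, in which I have to remove any line that starts with a word from a predefined list.

So far I have tried the following, but the output file (DB12_NEW) does not come out as required: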

rem = ['remove', 'remove1', 'remove2'....., 'removen']
inputFile = open(r"C:file", "r")
outputFile = open(r"C:file_12", "w")
lines = inputFile.readlines()
inputFile.close()
for line in lines:
    for i in rem:
        if line.startswith(i):
            outputFile.write('\n')
        else:
            outputFile.write(line)

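The output file I get has the same content as the file I put in... the script does not remove the lines that start with any of the strings in the list.

Can you help me understand how to achieve this?

Pass a tuple instead of a list to str.startswith.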

# rem = ['remove', 'rem-ove', 'rem ove']
rem = ('remove', 'rem-ove', 'rem ove')
with open('DB12', 'r') as inputFile, open('DB12_NEW', 'w') as outputFile:
    for line in inputFile:
        # startswith accepts a tuple and returns True if any prefix matches
        if not line.startswith(rem):
            outputFile.write(line)
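
To illustrate why the tuple matters: str.startswith accepts a single string or a tuple of strings, and raises TypeError when given a list. Below is a minimal sketch on in-memory data. The helper name `filter_lines` and the sample lines are made up for illustration and are not from the question.

```python
def filter_lines(lines, prefixes):
    """Return the lines that do not start with any of the given prefixes."""
    # str.startswith needs a tuple (not a list) to test several prefixes at once
    prefixes = tuple(prefixes)
    return [line for line in lines if not line.startswith(prefixes)]


# Sample data invented for illustration
sample = ["remove this\n", "keep this\n", "rem-ove that\n", "also keep\n"]
kept = filter_lines(sample, ["remove", "rem-ove", "rem ove"])
print(kept)
```

Converting inside the helper means callers can keep their prefixes in a list and still get the single-call check.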

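Currently, you check whether the line starts with one of the words in the remove list, one word at a time. For example:

If the line starts with "rem" and the loop is checking whether it starts with "remove", the if statement evaluates to false and the line is written to the output file. Because the else branch runs for every word that does not match, a line is written even when a different word in the list does match it.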
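
A small reproduction of the question's nested loop can make this visible. This is only a sketch: the prefixes and the two sample lines are invented, and an io.StringIO object stands in for the output file.

```python
import io

rem = ["remove", "rem-ove", "rem ove"]   # hypothetical prefixes
lines = ["remove me\n", "keep me\n"]     # invented sample input

out = io.StringIO()  # stands in for the output file
for line in lines:
    for i in rem:
        if line.startswith(i):
            out.write("\n")
        else:
            # runs once for every prefix that does NOT match this line
            out.write(line)

print(repr(out.getvalue()))
```

With three prefixes, the line that should be removed is still written twice (once for each prefix it does not match), and the line that should be kept is written three times.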

You could try something like this:

remove = ['remove', 'rem-ove', 'rem', 'rem ove' ...... 'n']
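with open(r"C:DB12", "r") as inputFile, open(r"C:DB12_NEW", "w") as outputFile:
    # Iterate over the file object directly; file objects have no splitlines() method
    for line in inputFile:
        # Write the line only if none of the words is a prefix of it
        if not any(line.startswith(i) for i in remove):
            outputFile.write(line)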

The built-in any function returns False only if all of the elements are False.
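
As a quick check of that behaviour, here is a short sketch. The helper name `starts_with_any` and the sample strings are assumptions made for illustration.

```python
remove = ["remove", "rem-ove", "rem ove"]  # hypothetical prefixes


def starts_with_any(line, prefixes):
    """True if at least one prefix matches the start of the line."""
    return any(line.startswith(p) for p in prefixes)


print(starts_with_any("remove me", remove))  # one prefix matches
print(starts_with_any("keep me", remove))    # no prefix matches
print(any([]))                               # empty iterable: nothing is true
```

The generator expression lets any stop at the first matching prefix instead of testing all of them.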

Sometimes this can be caused by leading/trailing whitespace.

Try stripping the whitespace with strip() before doing the check.

rem = [x.strip() for x in rem]
# strip() also removes the trailing newline, so add "\n" back when writing each line
lines = [line.strip() for line in lines]
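
One possible variation, sketched under assumptions: strip only for the comparison and keep each original line intact, so the newline survives when the line is written out. The prefixes and sample lines here are invented, and empty prefixes are dropped because every string starts with the empty string.

```python
rem = [" remove ", "rem-ove", "  "]  # hypothetical prefixes with stray spaces
lines = ["  remove me\n", "keep me\n", "\trem-ove x\n"]  # invented sample input

# Clean the prefixes once; skip any that are empty after stripping
prefixes = tuple(p.strip() for p in rem if p.strip())

# Test the left-stripped text, but keep the original line (with its newline)
kept = [line for line in lines if not line.lstrip().startswith(prefixes)]
print(kept)
```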
