Python: remove lines that begin with a specific character



How can I remove the lines that start with ">" from a txt file?

For example, the txt file has about 250k+ lines, and the code below takes quite a long time.

data = ""
with open(fileName) as f:
    for line in f:
        if ">" not in line:
            line = line.replace("\n", "")
            data += line

An example of the txt file:

> version 1.0125 revision 0... # This is the line to be removed
some random line 1
some random line 2
> version 1.0126 revision 0... # This is the line to be removed
...

I tried data = f.read(), which is instant, but the data then still contains the lines that start with ">".

Any help is appreciated. Thanks :(

Without knowing what you want to do with the data afterwards, this should be fast and correct:

with open(fileName) as f:
    data = "".join(line for line in f if not line.startswith(">"))
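Note that the question's loop appears intended to drop line breaks as well, whereas the join above keeps them. A minimal sketch of a variant that supports both behaviours follows; the function name `read_without_prefixed_lines` and its parameters are hypothetical and not part of the original answer.

```python
def read_without_prefixed_lines(path, prefix=">", keep_newlines=True):
    """Return the file's text with every line starting with `prefix` left out."""
    with open(path) as f:
        # Lazily filter lines; nothing is read until join() consumes the generator.
        kept = (line for line in f if not line.startswith(prefix))
        if not keep_newlines:
            # Drop only the trailing line break, not other whitespace.
            kept = (line.rstrip("\n") for line in kept)
        # str.join assembles the result once rather than growing a string with `+=`.
        return "".join(kept)
```

Calling it with keep_newlines=False should give one long string of the kept lines run together, which is what the loop in the question seems to be aiming for.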

If you just want to remove those lines from the file, honestly, I wouldn't do it in Python at all but directly in your shell, e.g. on Linux:

$ grep -v '^>' original_file.txt >fixed_file.txt

If you insist on using Python, do it line by line:

with open(original_file) as f:
    with open(new_file, "w") as g:
        for line in f:
            if not line.startswith(">"):
                g.write(line)
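The same loop can be wrapped in a small function that also reports how many lines were dropped, which may help to sanity-check the result on a file with 250k+ lines. This is only a sketch; `copy_without_prefixed_lines` and its arguments are hypothetical names.

```python
def copy_without_prefixed_lines(src, dst, prefix=">"):
    """Copy src to dst, skipping lines that start with `prefix`.

    Returns the number of skipped lines.
    """
    skipped = 0
    # A single `with` statement can manage both files.
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            if line.startswith(prefix):
                skipped += 1
            else:
                fout.write(line)
    return skipped
```

Because the file is streamed line by line, memory use stays roughly constant regardless of file size.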

Use two files, one for reading and one for appending:

with open(fileName, 'r') as f, open(fileName.replace('.txt', '_1.txt'), 'a+') as df:
    for line in f:
        if not line.startswith('>'):
            df.write(line)
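Two caveats about this approach may be worth keeping in mind: append mode adds to an existing output file, so running the script twice would duplicate the kept lines, and a string replacement on '.txt' leaves the name unchanged when the input has a different extension. A hedged sketch that derives the output name with pathlib and opens it in 'w' mode instead; `filtered_copy` is a hypothetical helper, and the `_1` suffix is taken from the snippet above.

```python
from pathlib import Path


def filtered_copy(file_name, prefix=">"):
    """Write lines not starting with `prefix` to '<stem>_1<suffix>'; return that path."""
    src = Path(file_name)
    # e.g. data.txt -> data_1.txt; also works for names without '.txt'.
    dst = src.with_name(src.stem + "_1" + src.suffix)
    # 'w' truncates an existing output, so reruns do not duplicate lines.
    with open(src) as f, open(dst, "w") as out:
        for line in f:
            if not line.startswith(prefix):
                out.write(line)
    return dst
```

If appending across several input files is actually the goal, keeping 'a' mode would be the right choice; the sketch assumes a single input and a fresh output.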
