Delete rows containing certain strings from a CSV file



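If a row in a CSV file contains a particular string anywhere in the row, I want to delete that row.

I want to write the result to a new output file rather than overwrite the original.

I need to remove every row that contains "py-board" or "coffee".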

Sample input:

173.20.1.1,2-base
174.28.2.2,2-game
174.27.3.109,xyz-b13-coffee-2
174.28.32.8,2-play
175.31.4.4,xyz-102-o1-py-board
176.32.3.129,xyz-b2-coffee-1
177.18.2.8,six-jump-walk

Expected output:

173.20.1.1,2-base
174.28.2.2,2-game
174.28.32.8,2-play
177.18.2.8,six-jump-walk

I tried this, taken from "Deleting rows with Python in a CSV file":

import csv
with open('input_csv_file.csv', 'rb') as inp, open('purged_csv_file', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[1] != "py-board" or if row[1] != "coffee":
            writer.writerow(row)

I also tried this:

import csv
with open('input_csv_file.csv', 'rb') as inp, open('purged_csv_file', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[1] != "py-board":
            if row[1] != "coffee":
                writer.writerow(row)

and this:

if row[1][-8:] != "py-board":
    if row[1][-8:] != "coffee-1":
        if row[1][-8:] != "coffee-2":

but I got this error:

File "C:testingsyslogyamlclean.py", line 6, in <module>
for row in csv.reader(inp):
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

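I actually wouldn't use the csv package for this. It can be done easily with standard file reading and writing.

Try this code (I added some comments to make it self-explanatory):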

# Open the source file and read all of its lines
with open('input_csv_file.csv', 'r') as inp:
    lines = inp.readlines()

# Open the target file in write mode
with open('purged_csv_file.csv', 'w') as out:
    # Go line by line, writing to the target file only
    # the lines that contain neither 'py-board' nor 'coffee'
    for line in lines:
        if 'py-board' not in line and 'coffee' not in line:
            out.write(line)
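
The same substring test can also be applied while streaming the file, so the whole input never has to sit in memory. The following is only a sketch: the names `BLOCKED`, `filter_lines`, and `purge_file` are made up for illustration and do not come from the answer above.

```python
# Hypothetical names for illustration: BLOCKED, filter_lines, purge_file.
BLOCKED = ("py-board", "coffee")  # substrings named in the question


def filter_lines(lines, blocked=BLOCKED):
    """Yield only the lines that contain none of the blocked substrings."""
    for line in lines:
        if not any(word in line for word in blocked):
            yield line


def purge_file(src, dst, blocked=BLOCKED):
    """Copy src to dst, dropping every line that contains a blocked substring."""
    # Iterating over a file object reads it one line at a time.
    with open(src, "r") as inp, open(dst, "w") as out:
        out.writelines(filter_lines(inp, blocked))
```

Calling `purge_file('input_csv_file.csv', 'purged_csv_file.csv')` would then write the filtered copy, and adding another unwanted string only means extending the tuple.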

# pandas helps to read and manipulate .csv files
import pandas as pd

# Read the .csv file (it has no header row)
df = pd.read_csv('input_csv_file.csv', sep=',', header=None)
print(df)
#               0                    1
# 0    173.20.1.1               2-base
# 1    174.28.2.2               2-game
# 2  174.27.3.109     xyz-b13-coffee-2
# 3   174.28.32.8               2-play
# 4    175.31.4.4  xyz-102-o1-py-board
# 5  176.32.3.129      xyz-b2-coffee-1
# 6    177.18.2.8        six-jump-walk

# Filter rows: keep those whose second column contains neither substring.
# (~ negates the boolean mask, so numpy does not need to be imported.)
result = df[~(df[1].str.contains('py-board') | df[1].str.contains('coffee'))]
print(result)
#              0              1
# 0   173.20.1.1         2-base
# 1   174.28.2.2         2-game
# 3  174.28.32.8         2-play
# 6   177.18.2.8  six-jump-walk

# Save to result.csv without the index or a header row
result.to_csv('result.csv', index=False, header=False)
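
For completeness, the csv-based attempts in the question can be made to work as well. In Python 3, `csv.reader` expects strings, so opening the files with `'rb'` and `'wb'` produces the "iterator should return strings, not bytes" error; the files need to be opened in text mode, and the csv documentation recommends passing `newline=''`. Separately, `row[1] != "py-board"` compares the whole field, which is never exactly equal to "py-board", so a substring test with `in` is needed. The sketch below assumes that only the second column should be checked; `BLOCKED` and `purge_rows` are hypothetical names.

```python
import csv

# Hypothetical names for illustration: BLOCKED, purge_rows.
BLOCKED = ("py-board", "coffee")  # substrings named in the question


def purge_rows(inp, out, blocked=BLOCKED):
    """Copy CSV rows from inp to out, skipping rows whose second
    column contains any of the blocked substrings."""
    writer = csv.writer(out)
    for row in csv.reader(inp):
        # Rows with fewer than two fields (e.g. blank lines) are kept as-is.
        if len(row) > 1 and any(word in row[1] for word in blocked):
            continue
        writer.writerow(row)


# Usage with the file names from the question (text mode, newline=''):
#
# with open('input_csv_file.csv', 'r', newline='') as inp, \
#         open('purged_csv_file.csv', 'w', newline='') as out:
#     purge_rows(inp, out)
```

Compared with the plain line-based approach, this version only looks at the second field, so an IP address or other column that happened to contain one of the strings would not cause the row to be dropped.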
