I am trying to delete specific columns from a CSV file.

CSV file:

Name,Age,YearofService,Department,Allocation
Birla,49,12,Welding,Production
Robin,38,10,Molding,Production

I am trying to delete the entire columns that have the headers "Department" and "Allocation".

My code:

with open("input.csv", 'r') as i:
    with open("output.csv", 'w', newline='') as o:
        reader = csv.reader(i)
        writer = csv.writer(o)
        for row in reader:
            for i in range(len(row)):
                if row[i] != "Department" and row[i] != "Allocation":
                    writer.writerow(row)

My output:

Name
Birla
Robin
Age
49
38
YearofService
12
10

Expected output:

Name,Age,YearofService
Birla,49,12
Robin,38,10

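We cannot guarantee that Department and Allocation will be at column positions 3 and 4. That is why I am iterating over the length of the row.

The csv.DictReader and csv.DictWriter classes are very handy in this case: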

import csv
with open("input.csv", newline="") as instream, open("output.csv", "w", newline="") as outstream:
    # Set up the input
    reader = csv.DictReader(instream)
    # Set up the output fields; build a new list so reader.fieldnames stays intact
    # and values keep lining up with their headers wherever the columns sit
    output_fields = [
        name for name in reader.fieldnames
        if name not in ("Department", "Allocation")
    ]
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(reader)

Notes

For the input, each row will be a dictionary, such as

{'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}

For the output, we drop the columns (fields) that are not needed; see output_fields
The extrasaction parameter tells DictWriter to ignore extra keys/values in the dictionary
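
To see what extrasaction changes, the following minimal sketch writes the same row twice to an in-memory buffer, once with "ignore" and once with the default "raise". The io.StringIO buffer and the helper name write_row are assumptions made for illustration; they are not part of the answer above.

```python
import csv
import io


def write_row(row, fieldnames, extrasaction):
    """Write one dict row to an in-memory CSV and return the text (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction=extrasaction)
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


row = {"Name": "Birla", "Age": "49", "YearofService": "12",
       "Department": "Welding", "Allocation": "Production"}
wanted = ["Name", "Age", "YearofService"]

# "ignore" silently drops the keys that are not listed in fieldnames.
ignored = write_row(row, wanted, "ignore")

# The default, "raise", rejects the same row with a ValueError.
try:
    write_row(row, wanted, "raise")
    raised = False
except ValueError:
    raised = True
```

Without extrasaction="ignore", each row would have to be stripped of the unwanted keys before being passed to the writer.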

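Update

To remove the columns from the CSV file in place, we need to

Open the input file, read all the rows, and close it
Open it again for writing

Here is the code, modified from the version above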

import csv
with open("input.csv", newline="") as instream:
    # Set up the input
    reader = csv.DictReader(instream)
    rows = list(reader)
    # Set up the output fields (all rows are already read at this point)
    output_fields = reader.fieldnames
    output_fields.remove("Department")
    output_fields.remove("Allocation")
with open("input.csv", "w", newline="") as outstream:
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(rows)
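
Reopening input.csv for writing truncates it, so an error partway through the write could leave an incomplete file. One possible variation, sketched here on the assumption that a temporary file in the same directory is acceptable, writes the result elsewhere first and then swaps it into place with os.replace. The function name drop_columns_in_place is hypothetical.

```python
import csv
import os
import tempfile


def drop_columns_in_place(path, columns_to_drop):
    """Rewrite the CSV at `path` without the named columns (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, newline="") as instream:
        reader = csv.DictReader(instream)
        # Build a new list so reader.fieldnames itself is left unchanged.
        output_fields = [f for f in reader.fieldnames if f not in columns_to_drop]
        # Write to a temporary file next to the original.
        with tempfile.NamedTemporaryFile(
            "w", newline="", dir=directory, suffix=".tmp", delete=False
        ) as outstream:
            writer = csv.DictWriter(
                outstream, fieldnames=output_fields, extrasaction="ignore"
            )
            writer.writeheader()
            writer.writerows(reader)
            temp_path = outstream.name
    # Swap the finished file into place only after both files are closed.
    os.replace(temp_path, path)
```

A call such as drop_columns_in_place("input.csv", {"Department", "Allocation"}) would then replace the file in one step. The sketch does not clean up the temporary file if writing fails, and it assumes the file has a header row.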

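The quickest and easiest way is to open the file in Excel and delete the columns you do not want. I know this is not what you asked for, but it is the first workaround that came to mind.

You can write something like this (though it would be better to use pandas):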

import csv
def delete_cols(file: str, cols_to_delete: list):
    cols_to_delete = set(cols_to_delete)
    with open(file, newline='') as source, open('output.csv', 'w', newline='') as output:
        rows = list(csv.reader(source))
        headers = rows[0]
        # Indexes of the columns whose header is in cols_to_delete
        indexes_to_delete = {idx for idx, elem in enumerate(headers) if elem in cols_to_delete}
        result = [[o for idx, o in enumerate(obj) if idx not in indexes_to_delete] for obj in rows]
        writer = csv.writer(output)
        writer.writerows(result)

delete_cols('data.csv', ['Department', 'Allocation'])

File output.csv:

Name,Age,YearofService
Birla,49,12
Robin,38,10
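
The function above reads the whole file into a list before writing anything. For larger files, a streaming variant might look like the sketch below, which looks up the column positions from the header and then filters one row at a time. The names drop_columns and delete_cols_streaming are hypothetical, and the code assumes the first row is the header.

```python
import csv


def drop_columns(rows, columns_to_drop):
    """Yield each row without the named columns; the first row must be the header."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return  # empty input: nothing to yield
    # Positions of the columns to keep, found by header name, not by fixed index.
    keep = [idx for idx, name in enumerate(header) if name not in columns_to_drop]
    yield [header[idx] for idx in keep]
    for row in rows:
        # Skip positions that a short (ragged) row does not have.
        yield [row[idx] for idx in keep if idx < len(row)]


def delete_cols_streaming(src_path, dst_path, columns_to_drop):
    """Copy src_path to dst_path without the named columns, one row at a time."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        csv.writer(dst).writerows(drop_columns(csv.reader(src), set(columns_to_drop)))
```

Calling delete_cols_streaming('data.csv', 'output.csv', ['Department', 'Allocation']) should produce the same output.csv as the version above for the sample data.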

How to delete specific columns from a CSV file without the pandas library
