How to delete specific columns from a CSV file without the pandas library



I am trying to delete specific columns from a CSV file.

CSV file:

Name,Age,YearofService,Department,Allocation
Birla,49,12,Welding,Production
Robin,38,10,Molding,Production

I want to delete the entire columns whose headers are "Department" and "Allocation".

My code:

import csv

with open("input.csv", 'r') as i:
    with open("output.csv", 'w', newline='') as o:
        reader = csv.reader(i)
        writer = csv.writer(o)
        for row in reader:
            for i in range(len(row)):
                if row[i] != "Department" and row[i] != "Allocation":
                    writer.writerow(row)

My output:

Name
Birla
Robin
Age
49
38
YearofService
12
10

Expected output:

Name,Age,YearofService
Birla,49,12
Robin,38,10

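We cannot guarantee that Department and Allocation will be at header positions "3" and "4". That is why I am iterating over the length of the row.

In this case, the csv.DictReader and csv.DictWriter classes are very handy: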

import csv

with open("input.csv", newline="") as instream, open("output.csv", "w", newline="") as outstream:
    # Set up the input
    reader = csv.DictReader(instream)
    # Set up the output fields. Build a new list instead of calling
    # reader.fieldnames.remove(): mutating that list in place would
    # misalign values whenever a removed column is not at the end.
    unwanted = {"Department", "Allocation"}
    output_fields = [name for name in reader.fieldnames if name not in unwanted]
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(reader)

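Notes

  • For the input, each row will be a dictionary such as

    {'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}
    
  • For the output, we drop the unwanted columns (fields); see output_fields

  • The extrasaction parameter tells DictWriter to ignore dictionary keys/values that are not listed in fieldnames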
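To see what extrasaction changes, here is a small sketch that writes the same row with and without it. The in-memory io.StringIO buffers and the shortened sample row are assumptions made only for illustration; they are not part of the answer above.

```python
import csv
import io

# A row that carries one key ("Department") not listed in the output fields.
row = {"Name": "Birla", "Age": "49", "Department": "Welding"}
fields = ["Name", "Age"]

# extrasaction="ignore": keys missing from fieldnames are dropped silently.
ignored = io.StringIO()
writer = csv.DictWriter(ignored, fieldnames=fields, extrasaction="ignore")
writer.writeheader()
writer.writerow(row)

# Default extrasaction="raise": the same row is rejected with ValueError.
strict = csv.DictWriter(io.StringIO(), fieldnames=fields)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True
```

With "ignore" the buffer holds only the Name and Age columns; with the default setting the writer refuses the row, which is why the answer passes extrasaction="ignore".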

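Update

To remove the columns from the CSV file itself (in place), we need to

  1. Open the input file, read all the rows, and close it
  2. Open it again for writing

Here is the code above, modified accordingly: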

import csv

with open("input.csv", newline="") as instream:
    # Set up the input
    reader = csv.DictReader(instream)
    rows = list(reader)
    # Set up the output fields. Build a new list rather than mutating
    # reader.fieldnames in place.
    unwanted = {"Department", "Allocation"}
    output_fields = [name for name in reader.fieldnames if name not in unwanted]

with open("input.csv", "w", newline="") as outstream:
    # Set up the output
    writer = csv.DictWriter(
        outstream,
        fieldnames=output_fields,
        extrasaction="ignore",  # Ignore extra dictionary keys/values
    )
    # Write to the output
    writer.writeheader()
    writer.writerows(rows)
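The in-place version above holds every row in memory and truncates input.csv before the new content is written, so a failure midway could lose data. One possible alternative, which is not part of the original answer, is to stream into a temporary file and swap it in afterwards. The function name drop_columns_in_place and the ".tmp" suffix are hypothetical, and the sketch assumes the file has a header row.

```python
import csv
import os
import tempfile


def drop_columns_in_place(path, unwanted):
    """Rewrite the CSV at `path` without the columns named in `unwanted`."""
    unwanted = set(unwanted)
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so that os.replace
    # does not have to move data across filesystems.
    with open(path, newline="") as instream, tempfile.NamedTemporaryFile(
        mode="w", newline="", dir=directory, suffix=".tmp", delete=False
    ) as outstream:
        reader = csv.DictReader(instream)
        fields = [name for name in reader.fieldnames if name not in unwanted]
        writer = csv.DictWriter(outstream, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(reader)  # rows are streamed, not collected in a list
    # Reached only if writing succeeded; the original is untouched until here.
    os.replace(outstream.name, path)
```

Calling drop_columns_in_place("input.csv", ["Department", "Allocation"]) would rewrite the file. If writing fails, the original file stays as it was, although the leftover temporary file is not cleaned up in this sketch.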

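The quickest and simplest way is to open the file in Excel and delete the columns you don't want. I know this isn't what you asked for, but it's the first workaround that came to mind.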

You can write something like this (although it is better to use pandas):

import csv

def delete_cols(file: str, cols_to_delete: list):
    cols_to_delete = set(cols_to_delete)
    with open(file, newline='') as infile, open('output.csv', 'w', newline='') as output:
        rows = list(csv.reader(infile))
        headers = rows[0]
        # Positions of the columns to drop, looked up by header name
        indexes_to_delete = {idx for idx, elem in enumerate(headers) if elem in cols_to_delete}
        # Keep every cell (header row included) whose position is not dropped
        result = [[o for idx, o in enumerate(obj) if idx not in indexes_to_delete] for obj in rows]
        writer = csv.writer(output)
        writer.writerows(result)

delete_cols('data.csv', ['Department', 'Allocation'])

The file output.csv:

Name,Age,YearofService
Birla,49,12
Robin,38,10
