Comparing two CSV files and creating a new file with the common elements, but Python raises a ValueError



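I recently started using Python for a project, so I apologize in advance for my inexperience. I am working with two different CSV files that share one common field. The files contain information about a series of books and differ in size. File one has a "description" field; file two does not. The field the files are joined on is "isbn". My goal is to create a .csv file containing the descriptions of the books that have the same ISBN code in both files. My code is: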

    import csv
    import pandas as pd

    dataset_description = '../dataset-books/dataset.csv'
    books_mod = '../dataset-books/booksmod.csv'
    output_file = '../dataset-books/newdataset.csv'
    cols_to_remove = [0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27]
    cols_to_remove = sorted(cols_to_remove, reverse=True)
    row_count = 0  # Current amount of rows processed

    with open(dataset_description, "r", encoding='Latin1') as source, \
            open(books_mod, 'r', encoding='Latin1') as source2:
        reader = pd.read_csv(source, delimiter=',')
        reader2 = pd.read_csv(source2, delimiter=',')
        with open(output_file, "w", newline='', encoding='Latin1') as result:
            writer = csv.writer(result)
            for row, row2 in reader, reader2:
                # row[19], row2[6] index column containing the code
                if row[19] == row2[6] and row_count != 10001:
                    for col_index in cols_to_remove:
                        del row[col_index]
                    writer.writerow([row_count, row])
                    row_count += 1
                else:
                    break
    source.close()
    source2.close()
    result.close()

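I read the CSV files, define the indexes of the columns to remove, open the files for reading and one file for writing, then try to keep only the rows with the same code and drop the rest. Finally, I write everything to one file. When I run it, I get the error: "ValueError: too many values to unpack (expected 2)". Please help me!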
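The error can be reproduced in isolation, which helps show where it comes from. The following is a minimal sketch, assuming two small made-up DataFrames (the names left and right and their contents are hypothetical, not taken from the real files). Iterating over a DataFrame yields its column labels, and "for row, row2 in reader, reader2" walks the tuple of the two DataFrames and tries to unpack each one into two names.

```python
import pandas as pd

# Hypothetical three-column frames standing in for the two CSV files
left = pd.DataFrame({"isbn": [1, 2], "title": ["A", "B"], "description": ["x", "y"]})
right = pd.DataFrame({"isbn": [2, 3], "title": ["B", "C"], "year": [2001, 2002]})

# Iterating over a DataFrame yields its column labels, not its rows
labels = list(left)

# The loop walks the tuple (left, right) and tries to unpack each DataFrame
# into two names, which fails because each frame has three columns
try:
    for row, row2 in left, right:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(labels)
print(error_message)
```

Pairing the rows with zip would probably not help either, since it matches rows by position rather than by ISBN; matching on a key column is what a merge does.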

Solution with pandas.merge()

Make sure the "on" merge field has the same dtype in both files. In this case, isbn is read as float64.

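    import numpy as np
    import pandas as pd

    # Read the key column with an explicit dtype so it matches in both files
    df3 = pd.read_csv("text.csv", dtype={'isbn': np.float64})

    # Small example frames sharing the "isbn" column
    f1 = pd.DataFrame({"isbn": [1, 2, 3, 5], "Authors": ['A', 'B', 'C', 'D']})
    f2 = pd.DataFrame({"isbn": [2, 3, 5], "Description": ["Book two", "Book Three", "Book 4"]})

    # Merge on isbn; indicator=True adds a "_merge" column showing the source of each row
    df = pd.merge(f1, f2, on=['isbn'], indicator=True)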
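Applied to the goal in the question, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the in-memory CSV contents, the column names other than "isbn" and "description", and the variable names are all hypothetical, and isbn is read as text rather than float64 so that leading zeros survive.

```python
import io
import pandas as pd

# Hypothetical stand-ins for dataset.csv and booksmod.csv
file_one = io.StringIO(
    "isbn,title,description\n"
    "0001,Book one,First description\n"
    "0002,Book two,Second description\n"
    "0003,Book three,Third description\n"
)
file_two = io.StringIO(
    "isbn,author\n"
    "0002,B\n"
    "0003,C\n"
    "0005,D\n"
)

# Read isbn as text in both files so the merge keys have the same dtype
with_description = pd.read_csv(file_one, dtype={"isbn": str})
without_description = pd.read_csv(file_two, dtype={"isbn": str})

# Inner merge keeps only the ISBNs present in both files
common = pd.merge(
    with_description[["isbn", "description"]],
    without_description[["isbn"]],
    on="isbn",
    how="inner",
)

# Write the result; a file path can be used instead of the in-memory buffer
output = io.StringIO()
common.to_csv(output, index=False)
print(output.getvalue())
```

Selecting columns by name before the merge replaces the manual list of column indexes to delete, and to_csv replaces the csv.writer loop.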
