之间找到共同的价值观两个csv文件



我有两个csv文件(比如a和b),它们都包含不同的数据集。这两个CSV文件之间唯一的共同点是id_no。我想创建一个最终的csv文件,其中包含id_no匹配的两个csv文件的所有数据集。

a看起来像

id_no   a1    a2     a3    a4
1       0.5  0.2    0.1    10.20
2       1.5  0.1    0.2    11.25
3       2.5  0.7    0.3    12.90
4       3.5  0.8    0.4    13.19
5       7.5  0.6    0.3    14.21

b看起来像

id_no   A1         
6       10.1  
8       2.5 
4       12.5  
2       20.5  
1       2.51 

我正在寻找一个最终的csv文件,说c显示以下输出

id_no   a1    a2     a3    a4       A1
1       0.5  0.2    0.1    10.20    2.51
2       1.5  0.1    0.2    11.25    20.5
3       2.5  0.7    0.3    12.90     0
4       3.5  0.8    0.4    13.19    12.5
5       7.5  0.6    0.3    14.21     0

使用pandas.merge:

import pandas as pd
a = pd.read_csv("data1.csv")
b = pd.read_csv("data2.csv")
output = a.merge(b, on="id_no", how="left").fillna(0).set_index("id_no")
output.to_csv("output.csv")
>>> output
a1   a2   a3     a4     A1
id_no                             
1      0.5  0.2  0.1  10.20   2.51
2      1.5  0.1  0.2  11.25  20.50
3      2.5  0.7  0.3  12.90   0.00
4      3.5  0.8  0.4  13.19  12.50
5      7.5  0.6  0.3  14.21   0.00

使用普通的老式python:

from csv import reader, writer
from pathlib import Path

with Path("file2.csv").open as f:
r = reader(f)
header = next(r)
data = {k:v for row in r for k, v in [row]}
rows = []
with Path("file1.csv").open() as f:
r = reader(f)
header.append(next(r)[-1])
for i, *row in r:
if i in data:
rows.append([i] + row + data[i])
else:
rows.append([i] + row + [0])
with Path("file1.csv").open("w") as f:
w = writer(f)
w.write_row(header)
w.write_rows(rows)