How to combine CSVs with matching names into one workbook in Python



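I have CSVs with matching names in two folders. The names are unique within each folder, i.e. for each file in Folder1 there may be one file with the same name in Folder2.

I want to combine the CSVs with matching names into one workbook, each as a sheet in that workbook. For example, if 220001.csv exists in both Folder1 and Folder2, the resulting workbook in Folder3 will be 220001.xlsx. If the file with the same name is missing from Folder2, the resulting .xlsx will contain only one sheet, the one from Folder1. Folder1 is the main folder.

How can I solve this in Python?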

Use:

import glob
import os

import pandas as pd

# Collect the CSV paths from both folders
files1 = glob.glob('Folder1/*.csv')
files2 = glob.glob('Folder2/*.csv')

# One row per file: column 0 = folder, 1 = file name, 2 = full path
df1 = pd.DataFrame([(*os.path.split(x), x) for x in files1])
df2 = pd.DataFrame([(*os.path.split(x), x) for x in files2])

# Join on the file name (column 1); unmatched files get NaN on the other side
df = df1.merge(df2, on=1, how='outer')

# Make sure the output folder exists
os.makedirs('Folder3', exist_ok=True)

# itertuples() renames the merged columns positionally:
# _1 = Folder1 folder, _2 = file name, _3 = Folder1 path,
# _4 = Folder2 folder, _5 = Folder2 path
for x in df.itertuples():
    # Folder1 is the main folder, so skip files that exist only in Folder2
    if pd.isna(x._3):
        continue
    out = os.path.join('Folder3', os.path.splitext(x._2)[0] + '.xlsx')
    # The context manager saves the workbook on exit
    # (writer.save() was removed in pandas 2.0)
    with pd.ExcelWriter(out) as writer:
        pd.read_csv(x._3).to_excel(writer, sheet_name=x._1)
        # Add a second sheet only when Folder2 has a file with the same name
        if pd.notna(x._5):
            pd.read_csv(x._5).to_excel(writer, sheet_name=x._4)
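
As an alternative, the matching step can also be done without a merge, by walking Folder1 and checking whether a file with the same name exists in Folder2. The following is only a minimal sketch under some assumptions: the helper names `pair_csvs` and `write_workbooks` are made up for illustration, the two folders have different, short names (they are used as sheet names), and an Excel engine such as openpyxl is installed for pandas to write .xlsx files.

```python
from pathlib import Path


def pair_csvs(main_dir, other_dir):
    """Map each CSV stem in main_dir to the list of CSV paths to combine.

    The main_dir path always comes first; the other_dir path is appended
    only when a file with the same name exists there.
    """
    main_dir, other_dir = Path(main_dir), Path(other_dir)
    pairs = {}
    for csv_path in sorted(main_dir.glob("*.csv")):
        sources = [csv_path]
        twin = other_dir / csv_path.name
        if twin.is_file():
            sources.append(twin)
        pairs[csv_path.stem] = sources
    return pairs


def write_workbooks(main_dir, other_dir, out_dir):
    """Write one .xlsx per CSV in main_dir, one sheet per source folder."""
    import pandas as pd  # imported lazily so pair_csvs works without pandas

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for stem, sources in pair_csvs(main_dir, other_dir).items():
        # The context manager saves and closes the workbook on exit
        with pd.ExcelWriter(out_dir / f"{stem}.xlsx") as writer:
            for src in sources:
                # Each sheet is named after the folder the CSV came from
                pd.read_csv(src).to_excel(
                    writer, sheet_name=src.parent.name, index=False
                )
```

With the folder layout from the question, this would be called as `write_workbooks("Folder1", "Folder2", "Folder3")`. Files that exist only in Folder2 are never visited, which matches Folder1 being the main folder.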
