pandas在一个循环中创建两个文件副本



我有一个数据帧,如下

import numpy as np
import pandas as pd
from numpy.random import default_rng
rng = default_rng(100)
cdf = pd.DataFrame({'Id':[1,2,3,4,5],
'customer': rng.choice(list('ACD'),size=(5)),
'region': rng.choice(list('PQRS'),size=(5)),
'dumeel': rng.choice(list('QWER'),size=(5)),
'dumma': rng.choice((1234),size=(5)),
'target': rng.choice([0,1],size=(5))
})

我正在尝试基于Customer拆分数据帧,并将其存储在文件夹中。不需要理解完整的代码。问题在最后一行。

i = 0
for k, v in df.groupby(['Customer']):
print(k.split('@')[0])
LM = k.split('@')[0]
i = i+1
unique_cust_names = '_'.join(v['Customer'].astype(str).unique())
unique_ids = '_'.join(v['Id'].astype(str).unique())
unique_location = '_'.join(v['dumeel'].astype(str).unique())
filename = '_'.join([unique_ids, unique_cust_names, unique_location, LM])
print(filename)
with pd.ExcelWriter(f"{filename}.xlsx", engine='xlsxwriter') as writer:
v.to_excel(writer,columns=col_list,index=False)
wb = load_workbook(filename = 'format_sheet.xlsx')
sheet_from =wb.worksheets[0]
wb1 = load_workbook(filename = f"{filename}.xlsx")
sheet_to = wb1.worksheets[0]
copy_styles(sheet_from, sheet_to)
#wb.close()
tab = Table(displayName = "Table1", ref = "A1:" + get_column_letter(wb1.worksheets[0].max_column) + str(wb1.worksheets[0].max_row) )
style = TableStyleInfo(name="TableStyleMedium2", showFirstColumn=False, showLastColumn=False, showRowStripes=True, showColumnStripes=False)
tab.tableStyleInfo = style
wb1.worksheets[0].add_table(tab)
#wb1.worksheets[0].parent.save(f"{filename}.xlsx")
wb1.save("test_files/" + f"{filename}.xlsx")  # issue is here
wb1.close()
print("Total number of customers to be emailed is ", i)

虽然代码运行良好,但问题在下面一行,我想

wb1.save("test_files/" + f"{filename}.xlsx")  # issue is here

这将创建两个文件副本。。一个在当前文件夹中作为jupyter笔记本文件,另一个在test_files文件夹中。

例如:我看到两个名为test1.xlsx的文件,一个在当前文件夹中,另一个在test_files文件夹中(路径为test_files/test1.xlsx(

我该如何避免这种情况?

我希望我的输出只为test_files文件夹中的每个客户生成/保存一个文件?

发生此问题是因为您引用了两个不同的文件名,一个带有前缀"test_files/",另一个没有前缀。处理此问题的最佳方法是如下定义文件名

dir_filename = "test_files/" + f"{filename}.xlsx"

然后在以下位置引用

with pd.ExcelWriter(f"{filename}.xlsx", engine='xlsxwriter') as writer:
v.to_excel(writer,columns=col_list,index=False)
##
wb1 = load_workbook(filename = f"{filename}.xlsx")
##
wb1.save("test_files/" + f"{filename}.xlsx") 

希望它能帮助

最新更新