Appending multiple Excel files in Python



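I'm trying to append multiple Excel files that share the same columns into a single table. If I use x.append(y, ignore_index = True) inside the loop, it doesn't work: at the end of the for loop, x contains only the table from the first file. However, if I run x.append(y, ignore_index = True) on its own in a separate cell after the for loop, with y still in memory, the append works. I'm using a Jupyter notebook.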

# import required modules
import os
import pandas as pd

# assign directory (raw strings, so the backslashes are not read as escapes)
# directory = r'C:\Users\Tomas\Documents\Python Scripts\csv\TimeLogs'
directory = r'C:\Users\Tomas\Documents\Python Scripts\csv\tmp'

# iterate over files in that directory
for idx, filename in enumerate(os.listdir(directory)):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        print(f)
        print(idx)
        if idx == 0:
            x = pd.read_excel(f, engine="openpyxl")
        else:
            y = pd.read_excel(f, engine="openpyxl")
            x.append(y, ignore_index=True)

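You can build a list of DataFrames and then join them with pandas' concat function. The loop in the question fails because DataFrame.append does not modify x in place. It returns a new DataFrame, and that result is discarded unless it is assigned back (x = x.append(...)). This is presumably also why the call seems to work in a separate cell: Jupyter displays the returned DataFrame, but x itself is still unchanged. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so pd.concat is the approach to use either way.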
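As a quick illustration of the list-then-concat idea, here is a minimal sketch on two small in-memory frames. The column names and values are made up for the example and are not taken from the question's files.

```python
import pandas as pd

# Two small frames with identical columns (made-up data for illustration).
x = pd.DataFrame({"task": ["a", "b"], "hours": [1.0, 2.5]})
y = pd.DataFrame({"task": ["c"], "hours": [4.0]})

# pd.concat returns a new DataFrame; x and y themselves are not modified,
# so the result has to be assigned to a name to be kept.
combined = pd.concat([x, y], ignore_index=True)

print(len(x))         # x still has only its own rows
print(len(combined))  # combined holds the rows of both frames
```

With ignore_index=True the result gets a fresh 0..n-1 index instead of repeating each input's own index.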

# import required modules
import os
import pandas as pd

# assign directory (raw strings, so the backslashes are not read as escapes)
# directory = r'C:\Users\Tomas\Documents\Python Scripts\csv\TimeLogs'
directory = r'C:\Users\Tomas\Documents\Python Scripts\csv\tmp'

# iterate over files in that directory and collect one DataFrame per file
list_of_dataframes = []
for filename in os.listdir(directory):
    f = os.path.join(directory, filename)
    # checking if it is a file
    if os.path.isfile(f):
        print(f)
        list_of_dataframes.append(pd.read_excel(f, engine="openpyxl"))

# concatenate once, after the loop; ignore_index=True renumbers the rows,
# matching the ignore_index=True used in the question
merged_df = pd.concat(list_of_dataframes, ignore_index=True)

This way, there is no need to check whether the index idx equals 0.
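The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the function name combine_excel_files and its pattern and reader parameters are hypothetical, it assumes the files end in .xlsx, and it skips names starting with "~$" because Excel creates such lock files while a workbook is open.

```python
from pathlib import Path

import pandas as pd


def combine_excel_files(directory, pattern="*.xlsx", reader=pd.read_excel):
    """Read every file in `directory` matching `pattern` and stack the rows.

    `reader` is a hypothetical hook: any callable that takes a path and
    returns a DataFrame. It defaults to pd.read_excel.
    """
    paths = sorted(
        p for p in Path(directory).glob(pattern)
        # skip sub-directories and Excel's "~$..." lock files
        if p.is_file() and not p.name.startswith("~$")
    )
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory}")
    frames = [reader(p) for p in paths]
    # a single concat at the end instead of appending inside the loop
    return pd.concat(frames, ignore_index=True)


# Example call (path taken from the question):
# merged_df = combine_excel_files(r'C:\Users\Tomas\Documents\Python Scripts\csv\tmp')
```

To pin the engine as in the code above, one option is to pass reader=functools.partial(pd.read_excel, engine="openpyxl").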
