我需要从数据库编辑200+ Excel文件,不幸的是数据库以一种奇怪的方式提供数据,这意味着我需要删除前2行,我需要重命名20个变量名称。其中19保持不变,但1变量名称(始终在同一位置)在excel表中正在改变。这个变化变量需要被命名为FY-0,名称更改为FY-1…
import os
import pandas as pd
in_path = 'ESG_untransfored'
out_path = 'ESG_transformed'
if not os.path.exists(out_path):
os.mkdir(out_path)
for filename in os.listdir(in_path):
if filename.endswith('.xlsx'):
df = pd.read_excel(os.path.join(in_path, filename))
df = df.iloc[2:]
duf=duf.rename(columns = {'Unnamed: 29':'FY-1'})
duf=duf.rename(columns = {'Unnamed: 30':'FY-2'})
duf=duf.rename(columns = {'Unnamed: 31':'FY-3'})
duf=duf.rename(columns = {'Unnamed: 32':'FY-4'})
duf=duf.rename(columns = {'Unnamed: 33':'FY-5'})
duf=duf.rename(columns = {'Unnamed: 34':'FY-6'})
duf=duf.rename(columns = {'Unnamed: 35':'FY-7'})
duf=duf.rename(columns = {'Unnamed: 36':'FY-8'})
duf=duf.rename(columns = {'Unnamed: 37':'FY-9'})
duf=duf.rename(columns = {'Unnamed: 38':'FY-10'})
duf=duf.rename(columns = {'Unnamed: 39':'FY-11'})
duf=duf.rename(columns = {'Unnamed: 40':'FY-12'})
duf=duf.rename(columns = {'Unnamed: 41':'FY-13'})
duf=duf.rename(columns = {'Unnamed: 42':'FY-14'})
duf=duf.rename(columns = {'Unnamed: 43':'FY-15'})
duf=duf.rename(columns = {'Unnamed: 44':'FY-16'})
duf=duf.rename(columns = {'Unnamed: 45':'FY-17'})
duf=duf.rename(columns = {'Unnamed: 46':'FY-18'})
duf=duf.rename(columns = {'Unnamed: 47':'FY-19'})
duf=duf.rename(columns = {'Unnamed: 48':'FY-20'})
df.to_excel(os.path.join(out_path, filename), index=False)
假设我正确理解了您的问题(并且duf
应该是df
),您可以这样做:
df = pd.read_excel(os.path.join(in_path, filename))
df = df.iloc[2:]
cols = list(df.columns)
cols[0] = "FY-0"
df.columns = cols
这将始终将第一列的名称更改为"FY-0"
。不太优雅,但应该能达到目的。
(您必须使用list(df.columns)
的原因是数据帧的columns属性是不可变的,但列表不是)。