Concatenating DataFrames in a for loop



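I want to combine several DataFrames using a for loop and the concat function, and store the result in a DataFrame named all_dfs. For some reason, each time the loop runs it kicks out the df that was previously in all_dfs. Any suggestions on how to fix this?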

    for i in vd_files_list:

        # Extract the scenario name without the VD extension
        print(i)
        scenario_name_w_vd = i.split("/")[-1]
        scenario_name = scenario_name_w_vd.split(".")[0]

        try:

            VD_filename = r"{}".format(i)
            df = pd.read_csv(filepath_or_buffer=VD_filename,
                             skiprows=(13),
                             names=("Attribute", "Commodity", "Process", "Period",
                                    "Region", "Vintage", "TimeSlice", "UserConstraint", "PV"),
                             dtype={"Attribute": str, "Commodity": str, "Process": str,
                                    "Period": str, "Region": str, "Vintage": str,
                                    "TimeSlice": str, "UserConstraint": str, "PV": float})
            # Add an extra column "Szenario" holding the scenario name
            df["Szenario"] = scenario_name

            all_dfs = pd.concat([df])
            print(all_dfs)


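Each pass through the loop reassigns all_dfs to pd.concat([df]), which contains only the current df, so whatever was stored before is discarded. Initialize all_dfs as a new DataFrame before the loop, then append to it on each iteration.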

    all_dfs = pd.DataFrame()
    for i in vd_files_list:
        # Extract the scenario name without the VD extension
        print(i)
        scenario_name_w_vd = i.split("/")[-1]
        scenario_name = scenario_name_w_vd.split(".")[0]

        try:
            VD_filename = r"{}".format(i)
            df = pd.read_csv(filepath_or_buffer=VD_filename,
                             skiprows=(13),
                             names=("Attribute", "Commodity", "Process", "Period",
                                    "Region", "Vintage", "TimeSlice", "UserConstraint", "PV"),
                             dtype={"Attribute": str, "Commodity": str, "Process": str,
                                    "Period": str, "Region": str, "Vintage": str,
                                    "TimeSlice": str, "UserConstraint": str, "PV": float})
            # Add an extra column "Szenario" holding the scenario name
            df["Szenario"] = scenario_name

            all_dfs.append(df)
        except:
            # what errors do you need to handle?
            pass
    print(all_dfs)
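
To see why the loop in the question kept only the last file, here is a minimal sketch that compares the two patterns side by side. The two small frames are made-up stand-ins for the parsed files, and the variable names overwritten and accumulated are illustrative only.

```python
import pandas as pd

# Two tiny stand-in frames (hypothetical data, not taken from the real files)
frames = [
    pd.DataFrame({"PV": [1.0, 2.0]}),
    pd.DataFrame({"PV": [3.0]}),
]

# Pattern from the question: the result is rebuilt from the current frame only
overwritten = pd.DataFrame()
for df in frames:
    overwritten = pd.concat([df])

# Accumulating pattern: pass the previous result to concat as well
accumulated = pd.DataFrame()
for df in frames:
    accumulated = pd.concat([accumulated, df], ignore_index=True)

print(len(overwritten))  # → 1 (rows of the last frame only)
print(len(accumulated))  # → 3 (rows of all frames)
```

In both loops the name on the left is reassigned on every iteration; what matters is whether the previous result is part of the list handed to pd.concat.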

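That almost solved it! The only change I made was to write all_dfs = all_dfs.append(df) instead of all_dfs.append(df), because append returns a new DataFrame rather than modifying all_dfs in place.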

The working code now looks like this (plus the exceptions I added):

    all_dfs = pd.DataFrame()
    for i in vd_files_list:

        # Extract the scenario name without the VD extension
        print(i)
        scenario_name_w_vd = i.split("/")[-1]
        scenario_name = scenario_name_w_vd.split(".")[0]

        try:

            VD_filename = r"{}".format(i)
            df = pd.read_csv(filepath_or_buffer=VD_filename,
                             skiprows=(13),
                             names=("Attribute", "Commodity", "Process", "Period",
                                    "Region", "Vintage", "TimeSlice", "UserConstraint", "PV"),
                             dtype={"Attribute": str, "Commodity": str, "Process": str,
                                    "Period": str, "Region": str, "Vintage": str,
                                    "TimeSlice": str, "UserConstraint": str, "PV": float})
            # Add an extra column "Szenario" holding the scenario name
            df["Szenario"] = scenario_name

            # append returns a new DataFrame, so the result must be assigned back
            all_dfs = all_dfs.append(df)

            print(all_dfs)

        except ValueError:
            tk.messagebox.showerror("Information", "Die ausgewählte Datei ist ungültig")
            return None
        except FileNotFoundError:
            tk.messagebox.showerror("Information", f" Die Datei {file_path} existiert nicht")
            return None
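
One caveat for anyone reading this later: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the code above only runs on older versions. A common alternative is to collect the per-file frames in a list and call pd.concat once after the loop. The sketch below assumes this approach; the function name combine_scenarios, the shortened column list, and the in-memory sample data are hypothetical, and skiprows is left out because the samples have no header block.

```python
import io

import pandas as pd

# Shortened column list for the sketch (the real files have nine columns)
COLUMNS = ["Attribute", "Commodity", "Process", "PV"]


def combine_scenarios(sources):
    """Read every (scenario_name, file-like) pair and return one combined DataFrame."""
    frames = []
    for scenario_name, handle in sources:
        try:
            df = pd.read_csv(handle, names=COLUMNS, dtype={"PV": float})
        except ValueError:
            # Skip inputs that cannot be parsed; real code might report them instead
            continue
        df["Szenario"] = scenario_name
        frames.append(df)
    # A single concat at the end avoids copying the growing result on every iteration
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Hypothetical in-memory stand-ins for the files in vd_files_list
sources = [
    ("base", io.StringIO("VAR_Act,ELC,P1,1.5\nVAR_Act,ELC,P2,2.5\n")),
    ("high", io.StringIO("VAR_Act,ELC,P1,3.0\n")),
]
all_dfs = combine_scenarios(sources)
print(all_dfs)
```

With real files, the loop body would pass the path and skiprows=13 to pd.read_csv as in the code above; only the accumulation step changes.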
