Iterating over DataFrames and adding items from a list

I'm still new to data analysis in Python.

I have more than 100 pandas DataFrames, and I have stored the variables that refer to them in a list.

I then saved the variable names as strings in a second list, so that I can add them to each DataFrame as an identifier when plotting.

I have defined a function to prepare the tables for later feature engineering.

I want to iterate over each DataFrame and add the corresponding string to a column named "Strings".

df = [df1, df2, df3]
strings = ['df1', 'df2', 'df3']

def mindex(df):
    # remove time index and insert Strings column
    df.reset_index(inplace=True)
    df.insert(1, "Strings", "")
    # iterate through each table adding the string values
    for item in enumerate(df):
        for item2 in strings:
            df['Strings'] = item2

# the loop to cycle through all the DataFrames using the function above
for i in df:
    mindex(i)

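When I use the function above, it fills only the last value into every DataFrame. Note that all the DataFrames cover the same date range; I tried to use that to stop the iteration, but without success.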

Can anyone point me in the right direction? So far Google hasn't been my friend.

df = [df1, df2, df3]
strings = ['df1', 'df2', 'df3']
for s, d in zip(strings, df):
    d['Strings'] = s

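In the line df['Strings'] = item2 you assign the variable item2 to the entire column df['Strings']. The first iteration assigns "df1", the second assigns "df2", and the last one assigns "df3", overwriting the column each time. That is why you only see the last value in the end.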
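The overwrite can be reproduced in a minimal sketch. The sample frame below is made up for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical sample frame standing in for one of the asker's DataFrames.
df1 = pd.DataFrame({"value": [1.0, 2.0, 3.0]})
strings = ["df1", "df2", "df3"]

# Each assignment replaces the whole column with a single scalar,
# so only the value from the final pass survives.
for item2 in strings:
    df1["Strings"] = item2

print(df1["Strings"].unique())
```

Every row ends up holding the last element of strings, regardless of which DataFrame is being processed.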

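If you want the Strings column to be filled entirely with "df1" for df1, with "df2" for df2, and so on, you have to pair each DataFrame with its own name: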

def mindex(dfs: list, strings: list) -> list:
    final_dfs = []
    for single_df, df_name in zip(dfs, strings):
        single_df = single_df.copy()
        single_df.reset_index(inplace=True)
        single_df.insert(1, "Strings", "")
        single_df['Strings'] = df_name
        final_dfs.append(single_df)
    return final_dfs

dfs = [df1, df2, df3]
strings = ['df1', 'df2', 'df3']
result = mindex(dfs, strings)
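For trying this out without the real data, here is a condensed, runnable sketch of the same idea. The function name tag_frames, the sample frames, and the final pd.concat step are assumptions added for illustration; the question does not say how the frames are combined for plotting.

```python
import pandas as pd


def tag_frames(dfs, strings):
    """Return new frames, each carrying its own name in a 'Strings' column."""
    tagged = []
    for single_df, df_name in zip(dfs, strings):
        # reset_index() returns a new frame, so the input is left untouched.
        out = single_df.reset_index()
        # insert() broadcasts the scalar name down the whole column.
        out.insert(1, "Strings", df_name)
        tagged.append(out)
    return tagged


# Hypothetical sample data sharing one date range.
dates = pd.date_range("2021-01-01", periods=3, freq="D")
df1 = pd.DataFrame({"value": [1, 2, 3]}, index=dates)
df2 = pd.DataFrame({"value": [4, 5, 6]}, index=dates)

result = tag_frames([df1, df2], ["df1", "df2"])

# Optional: stack the tagged frames into one long table for plotting.
combined = pd.concat(result, ignore_index=True)
```

Because the name is passed straight to insert(), the placeholder empty string and the second assignment are not needed in this variant.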

A few takeaways:

If you define a list of DataFrames, name it dfs (plural) rather than df.

dfs = [df1, df2, df3]

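If you want to iterate over the rows of a pandas DataFrame, use df.iterrows(). It yields the index together with the row, so you do not need enumerate. Iterating over the DataFrame directly, as in your example, yields the column labels rather than the rows.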

for idx, row in df.iterrows():
    ...

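If a for loop has a variable you do not intend to use, such as item in your example, replace it with an underscore (_), the conventional name for unused variables.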

for _ in enumerate(df):
    for item2 in strings:
        df['Strings'] = item2
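The two iteration takeaways above can be checked with a small sketch. The sample frame is hypothetical and only serves to show what each loop yields.

```python
import pandas as pd

# Hypothetical sample frame.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Iterating over the frame itself yields column labels, not rows.
labels = [label for label in df]

# iterrows() yields (index, row) pairs; each row is a Series.
# The index is not needed here, so it is bound to an underscore.
row_sums = [row["a"] + row["b"] for _, row in df.iterrows()]

print(labels)
print(row_sums)
```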
