Instead of doing this:
df['A'] = df['A'] if 'A' in df else None
df['B'] = df['B'] if 'B' in df else None
df['C'] = df['C'] if 'C' in df else None
df['D'] = df['D'] if 'D' in df else None
...
I'd like to do it in one line or in a single function. Here is what I tried:
def populate_columns(df):
    col_names = ['A', 'B', 'C', 'D', 'E', 'F', ...]
    def populate_column(df, col_name):
        df[col_name] = df[col_name] if col_name in df else None
        return df[col_name]
    df[col_name] = df.apply(lambda x: populate_column(x) for x in col_names)
    return df
All I get is Exception has occurred: ValueError. What can I do here?
It looks like you can use reindex in place of your entire code:
ensure_cols = ['A', 'B', 'C', 'D']
df = df.reindex(columns=df.columns.union(ensure_cols))
NB: the default fill value is NaN; if you really want None, use fill_value=None.
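One thing to watch for with the reindex call above: Index.union generally returns a sorted result, so the existing column order may change. If order matters, a minimal sketch along these lines keeps the original columns first and appends only the missing ones. The helper name ensure_columns and the sample frame are made up for illustration.

```python
import pandas as pd

def ensure_columns(df, ensure_cols):
    """Hypothetical helper: return a new frame that has every column in ensure_cols."""
    # Columns that are requested but not yet present, in the order requested.
    missing = [c for c in ensure_cols if c not in df.columns]
    # Keep the existing columns in their original order, then append the missing ones.
    return df.reindex(columns=list(df.columns) + missing)

df = pd.DataFrame({'B': [1, 2], 'A': [3, 4]})
out = ensure_columns(df, ['A', 'B', 'C', 'D'])
print(list(out.columns))
```

The new columns C and D are filled with NaN here, and the original df is left untouched because reindex returns a new object.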
If you would rather fix your own code, just use a loop:
col_names = ['A', 'B', 'C', 'D']
for c in col_names:
    if c not in df:
        df[c] = None
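Since the question asks for a single function, the same loop can be wrapped up as one. This is only a sketch: the signature (taking col_names as a parameter) and the sample data are assumptions, not part of the original code.

```python
import pandas as pd

def populate_columns(df, col_names):
    """Add each missing column from col_names to df in place, filled with None."""
    for c in col_names:
        if c not in df:  # `in` on a DataFrame checks the column labels
            df[c] = None
    return df

df = pd.DataFrame({'A': [1, 2]})
populate_columns(df, ['A', 'B', 'C'])
print(df.columns.tolist())
```

Unlike the reindex approach, this modifies df in place and leaves existing columns untouched.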