A sample DataFrame looks like this:
import pandas as pd

d = {'id': [1, 2], 'year': [1999, 2020], 'cat_up': [0, 15], 'cat_down': [2, 60], 'dog_up': [1, 11], 'dog_down': [2, 12], 'fish_up': [12, 11], 'fish_down': [22, 12]}
df = pd.DataFrame(data=d)
   id  year  cat_up  cat_down  dog_up  dog_down  fish_up  fish_down
0   1  1999       0         2       1         2       12         22
1   2  2020      15        60      11        12       11         12
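The real DataFrame has about 60 more columns, all with an _up or _down suffix.
I want to add new _res columns by dividing each _up column by its matching _down column, for example: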
df['cat_res'] = df['cat_up']/df['cat_down']
df['dog_res'] = df['dog_up']/df['dog_down']
...
I tried using a for loop, but the following is not correct:
up_cols = [col for col in df.columns if '_up' in col]
down_cols = [col for col in df.columns if '_down' in col]
for up, down in enumerate(up_cols, down_cols):
    df[column + '_res'] = up_cols/ down_cols
Can someone show me how to iterate over the columns and compute this?
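If you want to be careful with your columns, because your DataFrame may contain columns with the _up suffix that have no corresponding column with the _down suffix, you can use sets.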
up_cols = [col[:-3] for col in df.columns if col.endswith("_up")]
down_cols = [col[:-5] for col in df.columns if col.endswith("_down")]
# Set intersection handles the case where there is an X_up but no X_down
for col in set(up_cols) & set(down_cols):
    df[col + "_res"] = df[col + "_up"] / df[col + "_down"]
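To see the set-based approach end to end, here is a minimal self-contained sketch on the sample data. The bird_up column is a hypothetical addition, not part of the original question, included only to show that an unmatched _up column is skipped. The names up_prefixes and down_prefixes are likewise just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "year": [1999, 2020],
    "cat_up": [0, 15], "cat_down": [2, 60],
    "dog_up": [1, 11], "dog_down": [2, 12],
    "fish_up": [12, 11], "fish_down": [22, 12],
    "bird_up": [3, 4],  # hypothetical column with no bird_down partner
})

# Collect the prefixes (e.g. "cat") on each side by stripping the suffix.
up_prefixes = {c[:-len("_up")] for c in df.columns if c.endswith("_up")}
down_prefixes = {c[:-len("_down")] for c in df.columns if c.endswith("_down")}

# Only prefixes present on both sides are divided.
# sorted() just makes the order of the new columns deterministic.
for prefix in sorted(up_prefixes & down_prefixes):
    df[prefix + "_res"] = df[prefix + "_up"] / df[prefix + "_down"]

print(df.filter(like="_res"))
```

Using endswith rather than `'_up' in col` also avoids picking up a column that merely contains "_up" somewhere in the middle of its name.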
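Alternatively, with up_cols and down_cols holding the full column names as in the question, and assuming every _up column has a matching _down column in the same order, use zip (not enumerate) to pair them:
for up, down in zip(up_cols, down_cols):
    df[up.replace("_up", "_res")] = df[up] / df[down]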
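If you would rather not rely on the two lists being in the same order, one possible loop-free variant is to strip the suffixes and let pandas align the columns by name. This is only a sketch under that assumption. The names up, down, ratios and result are illustrative, and as far as I know a prefix that exists on only one side would come out as an all-NaN column rather than being skipped.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "year": [1999, 2020],
    "cat_up": [0, 15], "cat_down": [2, 60],
    "dog_up": [1, 11], "dog_down": [2, 12],
    "fish_up": [12, 11], "fish_down": [22, 12],
})

# Select each family of columns and strip the suffix so both frames
# share the same column labels ("cat", "dog", "fish").
up = df.filter(regex=r"_up$").rename(columns=lambda c: c[:-len("_up")])
down = df.filter(regex=r"_down$").rename(columns=lambda c: c[:-len("_down")])

# div() aligns on column labels, so the order of the columns does not matter.
ratios = up.div(down).add_suffix("_res")

# Attach the new columns to the original frame (joined on the row index).
result = df.join(ratios)
print(result.filter(like="_res"))
```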