Loop over multiple DataFrames and output each one separately (pandas)



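I have two separate DataFrames and I want to do the same thing to both of them. I want to pair the columns by the substring before the first underscore (a, b, y, z). Then, where the first column of a pair contains a word but the corresponding row in the totals column is null, I want to set the total to zero. I want to update the DataFrames and then output them separately.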

import pandas as pd
import numpy as np

d1 = pd.DataFrame(data={'a': ['yes', 'no', 'maybe', 'sometimes', np.nan],
                        'a_total': [5, 12, 4, np.nan, 0],
                        'b': ['blue', 'orange', 'pink', np.nan, np.nan],
                        'b_total': [12, 6, 0, 0, np.nan]})
d2 = pd.DataFrame(data={'y': ['frog', 'snail', 'snake', 'spider', 'pig'],
                        'y_total': [182, 32, 13, np.nan, 8],
                        'z': ['car', 'bike', 'walk', np.nan, np.nan],
                        'z_total': [12, 6, np.nan, np.nan, np.nan]})

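I then want to apply the same operation to both DataFrames and output the updated versions separately. My current code, copied below, does not produce the right output. I am trying to output a collection of DataFrames (the `out` dict), but if I could somehow output the two DataFrames themselves (d1 and d2), that would be fine too.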

out = {}
for i, df in enumerate([d1, d2]):
    key_id = [*df.loc[:, ~df.columns.str.endswith('total')].columns]
    totals = [*df.loc[:, df.columns.str.endswith('total')].columns]
    for col in key_id:
        pairs = df.loc[:, df.columns.str.startswith(col)]
        pairs[col+'_total'].loc[(pairs[col].notnull()) & (pairs[col+'_total'].isnull())] = 0
        out[i] = pd.concat([pairs], axis=1)

Thanks for taking a look.

I'm not sure I fully understand what you need as output, but this might work?

import pandas as pd
import numpy as np

d1 = pd.DataFrame(data={'a': ['yes', 'no', 'maybe', 'sometimes', np.nan],
                        'a_total': [5, 12, 4, np.nan, 0],
                        'b': ['blue', 'orange', 'pink', np.nan, np.nan],
                        'b_total': [12, 6, 0, 0, np.nan]})
d2 = pd.DataFrame(data={'y': ['frog', 'snail', 'snake', 'spider', 'pig'],
                        'y_total': [182, 32, 13, np.nan, 8],
                        'z': ['car', 'bike', 'walk', np.nan, np.nan],
                        'z_total': [12, 6, np.nan, np.nan, np.nan]})

# show d1 before making changes
print(d1)

# make the changes directly to d1 and d2 (both are modified in place)
for df in (d1, d2):
    # key columns are the ones that do not end in 'total'
    cols = [c for c in df.columns if not c.endswith('total')]

    for col in cols:
        tot_col = col + '_total'
        # key has a value but its total is missing: set the total to 0
        df.loc[df[col].notnull() & df[tot_col].isnull(), tot_col] = 0

# show d1 after making changes
print(d1)

d1 before the changes:

           a  a_total       b  b_total
0        yes      5.0    blue     12.0
1         no     12.0  orange      6.0
2      maybe      4.0    pink      0.0
3  sometimes      NaN     NaN      0.0
4        NaN      0.0     NaN      NaN

d1 after the changes:

           a  a_total       b  b_total
0        yes      5.0    blue     12.0
1         no     12.0  orange      6.0
2      maybe      4.0    pink      0.0
3  sometimes      0.0     NaN      0.0
4        NaN      0.0     NaN      NaN
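
If you would rather get back a dict of updated DataFrames, as in your `out = {}` attempt, and leave d1 and d2 untouched, something like the following might do. This is only a sketch: the helper name `fill_missing_totals`, its `suffix` parameter and the dict keys `"d1"` / `"d2"` are my own choices for illustration and do not come from your code.

```python
import numpy as np
import pandas as pd


def fill_missing_totals(df, suffix="_total"):
    """Return a copy of df in which a null total becomes 0 when its key column has a value.

    Hypothetical helper for illustration; the name and the suffix argument are assumptions.
    """
    out = df.copy()  # work on a copy so the caller's frame is not modified
    key_cols = [c for c in out.columns if not c.endswith(suffix)]
    for col in key_cols:
        tot_col = col + suffix
        if tot_col not in out.columns:
            continue  # no paired total column, nothing to update
        mask = out[col].notna() & out[tot_col].isna()
        out.loc[mask, tot_col] = 0
    return out


d1 = pd.DataFrame({'a': ['yes', 'no', 'maybe', 'sometimes', np.nan],
                   'a_total': [5, 12, 4, np.nan, 0],
                   'b': ['blue', 'orange', 'pink', np.nan, np.nan],
                   'b_total': [12, 6, 0, 0, np.nan]})
d2 = pd.DataFrame({'y': ['frog', 'snail', 'snake', 'spider', 'pig'],
                   'y_total': [182, 32, 13, np.nan, 8],
                   'z': ['car', 'bike', 'walk', np.nan, np.nan],
                   'z_total': [12, 6, np.nan, np.nan, np.nan]})

# one updated copy per input frame, keyed by a label of your choosing
frames = {"d1": d1, "d2": d2}
out = {name: fill_missing_totals(df) for name, df in frames.items()}

print(out["d1"])
print(out["d2"])
```

The update goes through a single `.loc[mask, column] = 0` call on the frame itself, rather than assigning into a slice such as `pairs[...]`, which may be a copy and so may not write back to the original.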
