How to remove cells matching a specific string from pandas columns



My current pandas DataFrame:


Subject   Test1   Test2   Test3
Python    45      Temp    25
          50      Temp    Temp
          Temp    30      Temp
          Temp    35      Temp
          Temp    38      Temp
OS        Temp    Temp    45
          33      24      32
          43      34      44
          Temp    43      31
Java      33      32      34
          43      33      36

Expected output:


Subject   Test1   Test2   Test3
Python    45      30      25
          50      35
                  38
OS        33      24      45
          43      34      32
                  43      44
                          31
Java      33      32      34
          43      33      36

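I need to remove the cells that match "Temp" so that the remaining values in each column move up. Dropping entire rows will not work in this case, so I am stuck here. Please help.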

If the blank cells in Subject are empty strings, first replace them with missing values (NaN) and then forward fill:

import numpy as np
df['Subject'] = df['Subject'].replace('', np.nan).ffill()
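
To see what this first step does in isolation, here is a minimal sketch on a small made-up Series. The name `subject` and its values are assumptions for illustration and are not taken from the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blank Subject cells are stored as empty strings.
subject = pd.Series(['Python', '', '', 'OS', '', 'Java', ''])

# ffill() only fills missing values, so '' must become NaN first;
# each NaN then takes the last label seen above it.
filled = subject.replace('', np.nan).ffill()
print(filled.tolist())
```

If the blanks are already NaN rather than empty strings, the `replace` call should be unnecessary and `ffill()` alone is enough.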

Then remove the Temp values within each group, using a custom function with GroupBy.apply:

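import pandas as pd

def f(x):
    # For each column, keep only the non-'Temp' values and re-index them from 0,
    # so the remaining values move up and shorter columns are padded with NaN.
    return x.apply(lambda col: pd.Series(col[col.ne('Temp')].tolist(), name=col.name))

df = (df.set_index('Subject')
        .groupby('Subject', sort=False)
        .apply(f)
        .reset_index(level=1, drop=True)
        .reset_index())
print(df)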
  Subject Test1 Test2 Test3
0  Python    45    30    25
1  Python    50    35   NaN
2  Python   NaN    38   NaN
3      OS    33    24    45
4      OS    43    34    32
5      OS   NaN    43    44
6      OS   NaN   NaN    31
7    Java    33    32    34
8    Java    43    33    36
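
For anyone who wants to try this end to end, the following is a self-contained sketch of the same per-group idea. It is a variant, not the code above: it uses an explicit loop with `pd.concat` instead of GroupBy.apply. The DataFrame is rebuilt by hand from the question's table, assuming every cell is stored as a string and blank Subject cells are empty strings; the helper name `compact` is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: all cells are strings).
df = pd.DataFrame({
    'Subject': ['Python', '', '', '', '', 'OS', '', '', '', 'Java', ''],
    'Test1': ['45', '50', 'Temp', 'Temp', 'Temp', 'Temp', '33', '43', 'Temp', '33', '43'],
    'Test2': ['Temp', 'Temp', '30', '35', '38', 'Temp', '24', '34', '43', '32', '33'],
    'Test3': ['25', 'Temp', 'Temp', 'Temp', 'Temp', '45', '32', '44', '31', '34', '36'],
})
df['Subject'] = df['Subject'].replace('', np.nan).ffill()

test_cols = ['Test1', 'Test2', 'Test3']

def compact(group):
    # Keep the non-'Temp' values of each column and renumber them from 0.
    # Building a DataFrame from Series of different lengths aligns them on
    # that new index and pads the shorter columns with NaN.
    cols = {c: group[c][group[c].ne('Temp')].reset_index(drop=True)
            for c in test_cols}
    out = pd.DataFrame(cols)
    out.insert(0, 'Subject', group['Subject'].iat[0])
    return out

# sort=False keeps the subjects in their order of first appearance.
result = pd.concat(
    [compact(g) for _, g in df.groupby('Subject', sort=False)],
    ignore_index=True,
)
print(result)
```

The explicit loop is slightly more verbose, but it avoids relying on how GroupBy.apply combines results, which has shifted a little between pandas versions.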

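Alternatively, after processing Subject, first unpivot with DataFrame.melt, remove the Temp rows, and then pivot back with DataFrame.pivot, using a counter from GroupBy.cumcount as a helper column: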

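df['Subject'] = df['Subject'].replace('', np.nan).ffill()
# Unpivot to long format: one row per (Subject, variable, value).
df = df.melt('Subject')
# Drop the 'Temp' cells.
df = df[df['value'].ne('Temp')].copy()
# Counter within each Subject/column pair; it becomes the new row position.
df['g'] = df.groupby(['Subject', 'variable']).cumcount()
# pivot arguments are passed by keyword (pandas 2.0+ rejects positional ones).
df = (df.pivot(index=['Subject', 'g'], columns='variable', values='value')
        .reset_index(level=1, drop=True)
        .reset_index()
        .rename_axis(None, axis=1))
print(df)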
  Subject Test1 Test2 Test3
0    Java    33    32    34
1    Java    43    33    36
2      OS    33    24    45
3      OS    43    34    32
4      OS   NaN    43    44
5      OS   NaN   NaN    31
6  Python    45    30    25
7  Python    50    35   NaN
8  Python   NaN    38   NaN
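
Note that the pivot sorts the result by Subject (Java, OS, Python) instead of keeping the original order. The following self-contained sketch runs the melt/pivot approach on data rebuilt from the question (assuming all cells are strings and blank Subject cells are empty strings) and then restores the original subject order. The names `long_df`, `wide_df` and `order` are hypothetical, and the final reordering step is an addition for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: all cells are strings).
df = pd.DataFrame({
    'Subject': ['Python', '', '', '', '', 'OS', '', '', '', 'Java', ''],
    'Test1': ['45', '50', 'Temp', 'Temp', 'Temp', 'Temp', '33', '43', 'Temp', '33', '43'],
    'Test2': ['Temp', 'Temp', '30', '35', '38', 'Temp', '24', '34', '43', '32', '33'],
    'Test3': ['25', 'Temp', 'Temp', 'Temp', 'Temp', '45', '32', '44', '31', '34', '36'],
})
df['Subject'] = df['Subject'].replace('', np.nan).ffill()

# Long format, without the 'Temp' cells.
long_df = df.melt('Subject')
long_df = long_df[long_df['value'].ne('Temp')].copy()
# Position of each surviving value within its Subject/column pair.
long_df['g'] = long_df.groupby(['Subject', 'variable']).cumcount()

wide_df = (long_df.pivot(index=['Subject', 'g'], columns='variable', values='value')
                  .reset_index(level='g', drop=True)
                  .reset_index()
                  .rename_axis(None, axis=1))

# Optional: put the subjects back in their order of first appearance.
# A stable sort keeps the row order inside each subject unchanged.
order = {name: i for i, name in enumerate(df['Subject'].unique())}
wide_df = (wide_df.sort_values('Subject', key=lambda s: s.map(order), kind='stable')
                  .reset_index(drop=True))
print(wide_df)
```

Passing a list as `index` to `pivot` and using `key` in `sort_values` both need a reasonably recent pandas (1.1 or later, as far as I know).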
