I have the following DataFrame:
VOTES CITY
24 A
22 A
20 B
NaN A
NaN A
30 B
NaN C
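I need to fill the NaN values in VOTES, for rows where CITY is "A" or "C", with the mean of the VOTES values in those same cities.
The code I tried below updated only the first row of VOTES; every other row became NaN.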
train['VOTES'][((train['VOTES'].isna()) & (train['CITY'].isin(['A','C'])))]=train['VOTES'].loc[((~train['VOTES'].isna()) & (train['CITY'].isin(['A','C'])))].astype(int).mean(axis=0)
After running this, every value in "VOTES" is NaN except the record at index 0, even though the mean itself is computed correctly.
Use Series.fillna only on the filtered rows, filling them with the mean of those same filtered rows:
train['VOTES_EN']=train['VOTES'].astype(str).str.extract(r'(-?\d+\.?\d*)').astype(float)
m= train['CITY'].isin(['A','C'])
mean = train.loc[m,'VOTES_EN'].mean()
train.loc[m,'VOTES_EN']=train.loc[m,'VOTES_EN'].fillna(mean)
train['VOTES_EN'] = train['VOTES_EN'].astype(int)
print (train)
VOTES CITY VOTES_EN
0 24.0 A 24
1 22.0 A 22
2 20.0 B 20
3 NaN A 23
4 NaN A 23
5 30.0 B 30
6 NaN C 23
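
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes VOTES is already numeric (so the string extraction step is skipped), and it uses Series.mask rather than fillna, which is a swapped-in alternative and not the approach shown above. The names in_cities, city_mean and VOTES_FILLED are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (assumes VOTES is numeric).
train = pd.DataFrame({
    "VOTES": [24, 22, 20, np.nan, np.nan, 30, np.nan],
    "CITY": ["A", "A", "B", "A", "A", "B", "C"],
})

# Boolean mask for the cities of interest.
in_cities = train["CITY"].isin(["A", "C"])

# Mean over those rows only; Series.mean skips NaN by default.
city_mean = train.loc[in_cities, "VOTES"].mean()

# Replace values only where the row is in A/C AND the vote is missing.
# Rows in other cities are left untouched.
train["VOTES_FILLED"] = train["VOTES"].mask(
    in_cities & train["VOTES"].isna(), city_mean
)

print(train)
```

Writing to a new column in one step avoids chained indexing such as train['VOTES'][mask] = ..., where pandas may assign into a temporary copy instead of the original frame.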