Python: Fill NaN values in a DataFrame with random values picked from the same column



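I have a DataFrame with some NaN values, shown below, and I want to fill each NaN with a value picked at random from the same column. For example, the NaN values in Col1 should be filled with values randomly picked from Col1.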

        Col1      Col2      Col3 Col4   Col5
0  -0.671603 -0.792415  0.783922 NaN    Blue
1   0.207720       NaN  0.996131 Tom    Yellow
2  -0.892115 -1.282333       NaN Julia  NaN
3  -0.315598 -2.371529 -1.959646 NaN    Pink
4        NaN       NaN -0.584636 NaN    Orange
5   0.314736 -0.692732 -0.303951 Jim    NaN
6   0.355121       NaN       NaN NaN    Red
7        NaN -1.900148  1.230828 Sophia NaN
8  -1.795468  0.490953       NaN Anne   Blue
9  -0.678491 -0.087815       NaN NaN    NaN
10  0.755714  0.550589 -0.702019 NaN    Pink
11  0.951908 -0.529933  0.344544 Tobi   Yellow
12       NaN  0.075340 -0.187669 Jon    Red
13       NaN  0.314342 -0.936066 NaN    Yellow
14       NaN  1.293355  0.098964 Peter  Orange

Any ideas?

This is what I tried:

import numpy as np
import pandas as pd
num_nan= df[col_name].isna().sum()
for n in len(range(num_nan)):
    #pick random value from e.g. col1 that's not NaN
    df[col_name] = df[col_name].where((pd.notnull(df)), None).sample(random_state= 1)
    #replace NaN-value in e.g. col1 with picked value
    df[col_name]= df.fillna('value')

Replace NaN values in a column with a random pick from the same column

You can try:

for c in df:
    mask = df[c].isna()
    # draw one replacement per NaN (with replacement) from the column's non-NaN values
    df.loc[mask, c] = np.random.choice(df.loc[~mask, c], size=mask.sum())
print(df)

This prints (for example; the filled values are random and will differ between runs):

        Col1      Col2      Col3    Col4    Col5
0  -0.671603 -0.792415  0.783922     Jon    Blue
1   0.207720 -1.900148  0.996131     Tom  Yellow
2  -0.892115 -1.282333 -0.702019   Julia     Red
3  -0.315598 -2.371529 -1.959646    Tobi    Pink
4  -0.892115  0.075340 -0.584636     Jon  Orange
5   0.314736 -0.692732 -0.303951     Jim    Pink
6   0.355121 -0.792415  0.344544     Tom     Red
7  -0.892115 -1.900148  1.230828  Sophia     Red
8  -1.795468  0.490953 -0.303951    Anne    Blue
9  -0.678491 -0.087815  0.344544     Jon  Yellow
10  0.755714  0.550589 -0.702019   Peter    Pink
11  0.951908 -0.529933  0.344544    Tobi  Yellow
12 -0.678491  0.075340 -0.187669     Jon     Red
13  0.951908  0.314342 -0.936066   Julia  Yellow
14 -0.892115  1.293355  0.098964   Peter  Orange
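
If you need the result to be reproducible, one option is to wrap the same idea in a function that takes a seed. This is only a sketch: the function name `fill_nan_from_same_column`, the `seed` parameter, and the small sample frame are made up for illustration and are not part of the question. It uses `np.random.default_rng` instead of the global `np.random.choice`, works on a copy, and skips columns that have nothing to fill or nothing to draw from.

```python
import numpy as np
import pandas as pd


def fill_nan_from_same_column(df, seed=None):
    """Return a copy of df in which every NaN is replaced by a value drawn
    at random (with replacement) from the non-NaN values of its own column.

    Hypothetical helper for illustration; not a pandas API.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in out.columns:
        mask = out[col].isna()
        pool = out.loc[~mask, col].to_numpy()
        # Skip columns with nothing to fill, or with no values to draw from.
        if mask.any() and len(pool) > 0:
            out.loc[mask, col] = rng.choice(pool, size=mask.sum())
    return out


# Small made-up frame in the spirit of the question's data.
df = pd.DataFrame({
    "Col1": [-0.671603, 0.207720, np.nan, -0.315598, np.nan],
    "Col4": [np.nan, "Tom", "Julia", np.nan, "Jim"],
})
filled = fill_nan_from_same_column(df, seed=1)
print(filled)
```

Calling the function twice with the same `seed` gives the same filled frame, and the original `df` keeps its NaN values.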
