I am trying to change data in a pandas DataFrame. Where X >= 5, I want to set the corresponding Y value to 1. Where X <= -5, I want to set the corresponding Y value to 0.
import numpy as np
import pandas as pd

# Generate random data
np.random.seed(2)
df = pd.DataFrame(np.random.randint(-10, 10, size=(10, 1)), columns=list('X'))
df['X2'] = np.random.randint(1, 20, df.shape[0])
df['Y'] = np.random.randint(0, 2, df.shape[0])

# My attempt
df['Y'] = [y if y <= 5 else 1 for y in df['X']]
df['Y'] = [y if y >= -5 else 0 for y in df['X']]
Output:
   X  X2   Y
0  5  11   5
1  5  13   5
2  5   5   5
3 -7   3   0
4  2   8   2
5 -7   7   0
6 -4   2  -4
7  1   8   1
8 -7  14   0
9 -2   8  -2
Expected:
   X  X2  Y
0  5  11  1
1  5  13  1
2  5   5  1
3 -7   3  0
4  2   8  Original random int
5 -7   7  0
6 -4   2  Original random int
7  1   8  Original random int
8 -7  14  0
9 -2   8  Original random int
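The attempt in the question appears to go wrong for two reasons: each list comprehension iterates over df['X'] and emits values of X (not Y), and the second assignment overwrites the result of the first. A minimal sketch reproducing this, using the X values copied from the output shown above (the variable names `first` and `second` are only for illustration):

```python
import pandas as pd

# X values copied from the output in the question.
df = pd.DataFrame({"X": [5, 5, 5, -7, 2, -7, -4, 1, -7, -2]})

# First comprehension: only values strictly greater than 5 become 1.
# No X here exceeds 5, so this is just a copy of X.
first = [y if y <= 5 else 1 for y in df["X"]]

# Second comprehension: starts again from X, so the first result is discarded.
# Values below -5 become 0, everything else stays as the X value.
second = [y if y >= -5 else 0 for y in df["X"]]

print(first)
print(second)
```

The second list matches the Y column of the unexpected output, which suggests the original random Y values were never consulted.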
Just use np.where:
import numpy as np
df['Y'] = np.where(df['X'].ge(5), 1, df['Y'])
df['Y'] = np.where(df['X'].le(-5), 0, df['Y'])
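For reference, here is a self-contained sketch of the same two np.where calls. The X values are taken from the question's output; the starting Y values are made up for illustration, since the question does not show them.

```python
import numpy as np
import pandas as pd

# X values from the question's output; Y values are assumed for illustration.
df = pd.DataFrame({
    "X": [5, 5, 5, -7, 2, -7, -4, 1, -7, -2],
    "Y": [0, 1, 0, 1, 1, 0, 0, 1, 1, 0],
})

# Where X >= 5 set Y to 1, otherwise keep the existing Y.
df["Y"] = np.where(df["X"].ge(5), 1, df["Y"])
# Where X <= -5 set Y to 0, otherwise keep the (already updated) Y.
df["Y"] = np.where(df["X"].le(-5), 0, df["Y"])

result_where = df["Y"].tolist()
print(result_where)
```

Rows with X strictly between -5 and 5 keep their starting Y, because the third argument of np.where supplies the value used wherever the condition is False.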
Even better, for multiple conditions, use np.select:
conditions = [df['X'].ge(5), df['X'].le(-5)]
choices = [1, 0]
df['Y'] = np.select(conditions, choices, default=df['Y'])
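A small runnable sketch of the np.select version, including the boundary values 5 and -5. The data here is made up for illustration and is not from the question.

```python
import numpy as np
import pandas as pd

# Made-up data covering both boundaries (5 and -5) and a value in between.
df = pd.DataFrame({"X": [6, -8, 0, 5, -5], "Y": [0, 1, 1, 0, 1]})

conditions = [df["X"].ge(5), df["X"].le(-5)]
choices = [1, 0]

# Rows matching neither condition fall back to their current Y via `default`.
df["Y"] = np.select(conditions, choices, default=df["Y"])

result_select = df["Y"].tolist()
print(result_select)
```

np.select checks the conditions in order and uses the first one that is True for each row. The two conditions here cannot both hold, so their order does not matter in this case.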
Or, if you want to do it with a list comprehension only, use zip:
df['Y'] = [1 if x >= 5 else (0 if x <= -5 else y) for x, y in zip(df['X'], df['Y'])]
Output:
original df
    X  X2  Y
0  -6  11  1
1 -10  10  0
2   6  15  1
3   9  12  0
4  -2   3  1
5  -5   2  0
6   5   6  1
7  -1  12  0
8   7  10  0
9  -6   9  0
df after np.where
    X  X2  Y
0  -6  11  0
1 -10  10  0
2   6  15  1
3   9  12  1
4  -2   3  1
5  -5   2  0
6   5   6  1
7  -1  12  0
8   7  10  1
9  -6   9  0