I am trying to build an if/else condition for a DataFrame, but it seems to give me an invalid syntax error. The data is as follows:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 30, size=10),
                  columns=["Random"],
                  index=pd.date_range("20180101", periods=10))
df = df.reset_index()
df['Recommandation'] = ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No']
df['diff'] = [3, 2, 4, 1, 6, 1, 2, 2, 3, 1]
df
I am trying to create another column, 'new', using the following conditions:
If the 'index' is one of the first three dates, then 'new' = 'Random',
elif 'Recommandation' is 'Yes', then 'new' = value of the previous row of the 'Random' column + 'diff',
else: 'new' = value of the previous row of the 'Random' column.
My code is as follows:
def my_fun(df, Recommendation, random, index, diff):
    print (x)
    if df[(df['index']=='2018-01-01')|(df['index']=='2018-01-02')|(df['index']=='2018-01-03')] :
        x = df['random']
    elif (df[df['recommendation']=='Yes']):
        x = df['random'].shift(1)+df['diff']
    else:
        x = df['random'].shift(1)
    return x
#The expected output:
df['new'] = [22, 20, 10, 31, 26, 6, 27, 5, 10, 13]
df
Based on your conditions, the code should be:
import numpy as np
df['new'] = np.select([df['index'].isin(df['index'].iloc[:3]), df['Recommandation'].eq('Yes')],
                      [df['Random'], df['diff'] + df['Random'].shift(1)],
                      df['Random'].shift(1))
Output:
       index  Random  Recommandation  diff   new
0 2018-01-01      22              No     3  22.0
1 2018-01-02      21             Yes     2  21.0
2 2018-01-03      29              No     4  29.0
3 2018-01-04      19             Yes     1  30.0
4 2018-01-05       1             Yes     6  25.0
5 2018-01-06       8             Yes     1   2.0
6 2018-01-07       0              No     2   8.0
7 2018-01-08       4              No     2   0.0
8 2018-01-09      27             Yes     3   7.0
9 2018-01-10      27              No     1  27.0
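To make the result above reproducible, here is a minimal sketch that swaps the random data for the fixed 'Random' values shown in the output table. The names conditions, choices and prev are only illustrative; the np.select call is the same idea as in the answer, where the first matching condition wins and the default covers the else branch.

```python
import numpy as np
import pandas as pd

# Fixed values copied from the output table, instead of np.random.randint
df = pd.DataFrame({
    "index": pd.date_range("20180101", periods=10),
    "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
    "Recommandation": ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No'],
    "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
})

# Previous row's 'Random' value; NaN for the first row, hence the float result
prev = df["Random"].shift(1)

# Conditions are checked in order, like if / elif
conditions = [
    df["index"].isin(df["index"].iloc[:3]),  # one of the first three dates
    df["Recommandation"].eq("Yes"),          # recommendation is 'Yes'
]
choices = [
    df["Random"],        # value used when the first condition holds
    prev + df["diff"],   # value used when the second condition holds
]

# default plays the role of the else branch
df["new"] = np.select(conditions, choices, default=prev)
print(df)
```

Because shift(1) introduces a NaN, the 'new' column comes out as float even though every selected value is a whole number.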
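In the else clause, x[i] is wrong because i is not defined.
If you want to add a column to your DataFrame, assign it inside each branch (in a function that returns nothing), as shown below: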
if ....:
    df['newcol']=...
elif ...:
    df['newcol']=...
.
.
.
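A plain if/elif needs a single True or False, so it cannot test a whole DataFrame or Series at once; pandas raises a ValueError about an ambiguous truth value in that case. One way to keep the if/elif style is to evaluate the conditions one row at a time. The following is only a sketch under assumptions: new_value is a hypothetical helper, the 'Random' values are taken from the output table in the other answer, and the frame is assumed to have the default 0..n-1 integer index produced by reset_index().

```python
import pandas as pd

# Fixed sample data (Random values copied from the output table above)
df = pd.DataFrame({
    "index": pd.date_range("20180101", periods=10),
    "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
    "Recommandation": ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No'],
    "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
})

def new_value(frame, i):
    """Hypothetical helper: scalar if/elif/else for the row labelled i."""
    if i < 3:
        # First three dates: keep the row's own 'Random' value
        return frame.loc[i, "Random"]
    prev = frame.loc[i - 1, "Random"]  # previous row's 'Random' value
    if frame.loc[i, "Recommandation"] == "Yes":
        return prev + frame.loc[i, "diff"]
    return prev

# Build the column from per-row scalars, then assign it once
df["new"] = [new_value(df, i) for i in range(len(df))]
print(df)
```

This row-wise loop is easier to read next to the original conditions but is slower than the vectorized np.select approach on large frames.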