Syntax error when applying multiple conditions on different columns to create a new column



I am trying to build an if/else condition for a DataFrame, but it seems to give me invalid syntax. The data is as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 30, size=10),
                  columns=["Random"],
                  index=pd.date_range("20180101", periods=10))
df = df.reset_index()
df['Recommandation'] = ['No', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'No']
df['diff'] = [3, 2, 4, 1, 6, 1, 2, 2, 3, 1]
df

I am trying to create another column, 'new', using the following conditions:

If 'index' is one of the first three dates, then 'new' = 'Random',
elif 'Recommandation' is 'Yes', then 'new' = the previous row's value of the 'Random' column + 'diff',
else 'new' = the previous row's value of the 'Random' column.

My code is as follows:

def my_fun(df, Recommendation, random, index, diff):
    print (x)
    if df[(df['index']=='2018-01-01')|(df['index']=='2018-01-02')|(df['index']=='2018-01-03')] :
        x = df['random']
    elif (df[df['recommendation']=='Yes']):
        x = df['random'].shift(1)+df['diff']
    else:
        x = df['random'].shift(1)
    return x
#The expected output:
df['new'] = [22, 20, 10, 31, 26, 6, 27, 5, 10, 13]
df

Based on your conditions, the code should be:

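import numpy as np

df['new'] = np.select(
    # conditions are checked in order; the first one that matches wins
    [df['index'].isin(df['index'].iloc[:3]), df['Recommandation'].eq('Yes')],
    [df['Random'], df['diff'] + df['Random'].shift(1)],
    # default: used for rows where neither condition holds
    df['Random'].shift(1),
)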

Output:

index  Random Recommandation  diff   new
0 2018-01-01      22             No     3  22.0
1 2018-01-02      21            Yes     2  21.0
2 2018-01-03      29             No     4  29.0
3 2018-01-04      19            Yes     1  30.0
4 2018-01-05       1            Yes     6  25.0
5 2018-01-06       8            Yes     1   2.0
6 2018-01-07       0             No     2   8.0
7 2018-01-08       4             No     2   0.0
8 2018-01-09      27            Yes     3   7.0
9 2018-01-10      27             No     1  27.0
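
Because the question builds 'Random' with np.random.randint, the table above cannot be reproduced exactly from the original setup. As a minimal sketch, the version below hard-codes the 'Random' values shown in that table (an assumption made only for reproducibility) and applies the same np.select logic; the names prev_random, conditions and choices are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# 'Random' values are copied from the output table so the result is reproducible.
df = pd.DataFrame(
    {
        "index": pd.date_range("20180101", periods=10),
        "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
        "Recommandation": ["No", "Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"],
        "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
    }
)

# Previous row's 'Random' value; NaN for the first row.
prev_random = df["Random"].shift(1)

# Conditions are evaluated in order, like if / elif.
conditions = [
    df["index"].isin(df["index"].iloc[:3]),  # one of the first three dates
    df["Recommandation"].eq("Yes"),
]
choices = [
    df["Random"],
    prev_random + df["diff"],
]

# Rows matching no condition fall back to the previous 'Random' value (the else branch).
df["new"] = np.select(conditions, choices, default=prev_random)
print(df)
```

The 'new' column comes out as float because shift(1) introduces a NaN in the first row, even though that NaN is never selected.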

In the else clause, x[i] is wrong because i is not defined.

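If you want to add a column to your DataFrame, you have to use a void function (one that assigns the column instead of returning a value), as shown below: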


if ....:
    df['newcol']=...
elif ...:
    df['newcol']=...
.
.
.
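
One possible way to make the if/elif outline above work on whole columns is to assign through boolean masks with df.loc, since a plain `if` on a DataFrame or Series raises "truth value is ambiguous". This is only a sketch under assumptions: the 'Random' values are fixed sample data, and the names first_three, is_yes and prev_random are illustrative rather than something the answer specifies.

```python
import pandas as pd

# Fixed sample data (assumed values) in the same shape as the question's DataFrame.
df = pd.DataFrame(
    {
        "index": pd.date_range("20180101", periods=10),
        "Random": [22, 21, 29, 19, 1, 8, 0, 4, 27, 27],
        "Recommandation": ["No", "Yes", "No", "Yes", "Yes", "Yes", "No", "No", "Yes", "No"],
        "diff": [3, 2, 4, 1, 6, 1, 2, 2, 3, 1],
    }
)

first_three = df.index < 3                 # first three rows of the default RangeIndex
is_yes = df["Recommandation"].eq("Yes")
prev_random = df["Random"].shift(1)        # previous row's value; NaN for row 0

# Start with the "else" branch, then overwrite with the more specific branches.
# Later assignments win, so the highest-priority condition goes last.
df["newcol"] = prev_random
df.loc[is_yes, "newcol"] = prev_random + df["diff"]
df.loc[first_three, "newcol"] = df["Random"]
print(df)
```

The order of the assignments mirrors if/elif/else in reverse: the first-three-dates rule is applied last so that it takes precedence over the 'Yes' rule.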


