KeyError when using the pandas assign function



I have the dataframe below, and I want to create the new variables "profit_loss" and "margin" based on revenue and budget.

          revenue     budget
0      1513528810  150000000
1       378436354  150000000
2       295238201  110000000
3      2068178225  200000000
4      1506249360  190000000

I tried to create the new variables with the pandas assign() method, but I got the error below.

d.assign(profit_loss = (d['revenue'] - d['budget']), 
profit_loss_margin = (d['profit_loss'] * 100 / d['revenue']), 
financial_status = d['profit_loss'].apply(lambda num: 'Profit-Making' if num > 0 else 'Loss-Making'))

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2895                 return self._engine.get_loc(casted_key)
   2896             except KeyError as err:
-> 2897                 raise KeyError(key) from err
   2898 
   2899         if tolerance is not None:

KeyError: 'profit_loss'

However, the code below runs fine.

d.assign(profit_loss = (d['revenue'] - d['budget']))

Could you tell me what mistake I made in the earlier code?

To reference a column created earlier in the same assign call, such as profit_loss here, you need a lambda:

df = d.assign(profit_loss = (d['revenue'] - d['budget']), 
              profit_loss_margin = lambda x: (x['profit_loss'] * 100 / x['revenue']), 
              financial_status = lambda x: x['profit_loss'].apply(lambda num: 'Profit-Making' if num > 0 else 'Loss-Making'))
print(df)
      revenue     budget  profit_loss  profit_loss_margin financial_status
0  1513528810  150000000   1363528810           90.089386    Profit-Making
1   378436354  150000000    228436354           60.363216    Profit-Making
2   295238201  110000000    185238201           62.741949    Profit-Making
3  2068178225  200000000   1868178225           90.329654    Profit-Making
4  1506249360  190000000   1316249360           87.385887    Profit-Making
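
As a minimal sketch of why the lambda form works: when a value passed to assign is a callable, pandas calls it on the DataFrame as it stands at that point, with the columns from earlier keyword arguments already added (recent pandas versions assign keyword arguments in the order given). The example below uses only the first two rows of the question's data; the name eager_error is a hypothetical helper introduced for illustration.

```python
import pandas as pd

# First two rows of the question's data.
d = pd.DataFrame({
    "revenue": [1513528810, 378436354],
    "budget": [150000000, 150000000],
})

# Eager form: d["profit_loss"] is looked up on d *before* assign is called,
# so it raises KeyError because d has no such column.
try:
    d.assign(
        profit_loss=d["revenue"] - d["budget"],
        profit_loss_margin=d["profit_loss"] * 100 / d["revenue"],
    )
    eager_error = None
except KeyError as err:
    eager_error = err

# Lazy form: each lambda receives the intermediate DataFrame, which
# already contains the columns assigned earlier in the same call.
df = d.assign(
    profit_loss=lambda x: x["revenue"] - x["budget"],
    profit_loss_margin=lambda x: x["profit_loss"] * 100 / x["revenue"],
)

print(repr(eager_error))
print(df)
```

The original d is left unchanged in both cases; only the returned df carries the new columns.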

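You are creating the column "profit_loss" and trying to use it in the same call to create other new columns. Python evaluates all of the arguments before it calls the function. So when it evaluates d['profit_loss'] in the second and third arguments, that column does not exist yet, because assign has not been called. Try two separate calls instead: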

# assign returns a new DataFrame, so keep the result of each call
d = d.assign(profit_loss = (d['revenue'] - d['budget']))
d = d.assign(profit_loss_margin = (d['profit_loss'] * 100 / d['revenue']), 
             financial_status = d['profit_loss'].apply(lambda num: 'Profit-Making' if num > 0 else 'Loss-Making'))
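
One detail worth checking with the two-call approach: assign returns a new DataFrame and does not modify d in place, so the result of the first call has to be stored before the second call can read profit_loss from it. A small sketch under that assumption, using the first three rows of the question's data (step1 and step2 are hypothetical names chosen for illustration):

```python
import pandas as pd

# First three rows of the question's data.
d = pd.DataFrame({
    "revenue": [1513528810, 378436354, 295238201],
    "budget": [150000000, 150000000, 110000000],
})

# First call: store the returned DataFrame, since d itself is not modified.
step1 = d.assign(profit_loss=d["revenue"] - d["budget"])

# Second call: read profit_loss from step1, where it now exists.
step2 = step1.assign(
    profit_loss_margin=step1["profit_loss"] * 100 / step1["revenue"],
    financial_status=step1["profit_loss"].apply(
        lambda num: "Profit-Making" if num > 0 else "Loss-Making"
    ),
)

print("profit_loss" in d.columns)  # the original frame is untouched
print(step2)
```

Rebinding the same name (d = d.assign(...)) works equally well if the original frame is no longer needed.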
