Pandas: assign string values based on multiple ranges



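I wrote a small function that assigns a string value to one column based on which range the value in another column falls into, e.g. 3.2 == '0-6m', 7 == '6-12m', but I get this error: TypeError: 'float' object is not subscriptable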

DataFrame:

StartingHeight
4.0
3.2
8.0
32.0
12.0
18.3

Expected output:

StartingHeight height_factor
4.0          0-6m
3.2          0-6m
8.0         6-12m
32.0          >30m
12.0         6-12m
18.3        18-24m

Code:

def height_bands(hbcol):
    """Apply string value based on float value ie: 6.2 == '6-12m
    hb_values = ['0-6m', '6-12m', '12-18m', '18-24m', '24-30m', '>30m']"""
    if (hbcol['StartingHeight'] >= 0) | (hbcol['StartingHeight'] < 6.1):
        return '0-6m'
    elif (hbcol['StartingHeight'] >= 6.1) | (hbcol['StartingHeight'] < 12):
        return '6-12m'
    elif (hbcol['StartingHeight'] >= 12) | (hbcol['StartingHeight'] < 18):
        return '12-18m'
    elif (hbcol['StartingHeight'] >= 18) | (hbcol['StartingHeight'] < 24):
        return '18-25m'
    else:
        return '>30m'

df1['height_factor'] = df1.apply(lambda x: height_bands(x['StartingHeight']), axis=1)

Thanks for your help!
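
On the error itself: the lambda already extracts x['StartingHeight'], which is a float, and height_bands then tries to index that float again with hbcol['StartingHeight']. Separately, the conditions are joined with |, so the first branch (>= 0 or < 6.1) is true for every non-negative height. Below is a minimal sketch of a corrected element-wise version. It assumes upper bounds are inclusive (so 12.0 maps to '6-12m', as in the expected output) and that heights are non-negative; the name height_band is made up for illustration.

```python
import pandas as pd


def height_band(height: float) -> str:
    """Map one height in metres to its band label.

    Assumption: upper edges are inclusive (12.0 -> '6-12m') and heights
    are non-negative.
    """
    if height <= 6:
        return "0-6m"
    elif height <= 12:
        return "6-12m"
    elif height <= 18:
        return "12-18m"
    elif height <= 24:
        return "18-24m"
    elif height <= 30:
        return "24-30m"
    return ">30m"


df1 = pd.DataFrame({"StartingHeight": [4.0, 3.2, 8.0, 32.0, 12.0, 18.3]})

# Series.apply hands each float to the function, so nothing is subscripted.
df1["height_factor"] = df1["StartingHeight"].apply(height_band)
print(df1)
```

Applying the function to the column (a Series) rather than to the whole frame with axis=1 avoids the double lookup that caused the error.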

You can use pd.cut:

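import numpy as np
import pandas as pd

# Bins are closed on the right by default, so 12.0 lands in '6-12m';
# include_lowest=True keeps a height of exactly 0 in the first bin.
df['height_factor'] = pd.cut(df['StartingHeight'],
                             bins=[0, 6, 12, 18, 24, 30, np.inf],
                             labels=['0-6m', '6-12m', '12-18m',
                                     '18-24m', '24-30m', '>30m'],
                             include_lowest=True)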

Output:

>>> df
   StartingHeight height_factor
0             4.0          0-6m
1             3.2          0-6m
2             8.0         6-12m
3            32.0          >30m
4            12.0         6-12m
5            18.3        18-24m
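
A value that sits exactly on a bin edge, such as 12.0 here, goes to the lower or the upper band depending on the right argument of pd.cut. The sketch below rebuilds the sample column and compares the two settings side by side; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

heights = pd.Series([4.0, 3.2, 8.0, 32.0, 12.0, 18.3], name="StartingHeight")
bins = [0, 6, 12, 18, 24, 30, np.inf]
labels = ["0-6m", "6-12m", "12-18m", "18-24m", "24-30m", ">30m"]

# Right-closed bins (the default): [0, 6], (6, 12], (12, 18], ...
right_closed = pd.cut(heights, bins=bins, labels=labels, include_lowest=True)

# Left-closed bins: [0, 6), [6, 12), [12, 18), ...
left_closed = pd.cut(heights, bins=bins, labels=labels, right=False)

comparison = pd.DataFrame({
    "StartingHeight": heights,
    "right_closed": right_closed.astype(str),
    "left_closed": left_closed.astype(str),
})
print(comparison)
```

With right-closed bins 12.0 is labelled '6-12m'; with right=False it is labelled '12-18m'. The other sample values get the same label either way.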
