I have an employee dataset with salary details. I want to add an extra column that shows each employee's salary group, such as high/mid/low:
Data:
Empno Sal Deptno
1 800 20
2 1600 30
3 2975 20
4 1250 30
5 2850 30
6 2450 10
7 3000 20
Expected output:
Empno Sal Deptno Sal_Group
1 800 20 low
2 1600 30 mid
3 2975 20 ...
4 1250 30 ...
5 2850 30 ...
6 2450 10 ...
7 3000 20 high
You can try this:
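import numpy as np
import pandas as pd

df = pd.read_csv("file.csv")  # assumes the file has the columns Empno, Sal, Deptno
# Four evenly spaced edges between the lowest and highest salary give three equal-width bins.
bins = np.linspace(df['Sal'].min(), df['Sal'].max(), 4)
group_names = ["low", "mid", "high"]
# include_lowest=True keeps the minimum salary in the first bin instead of leaving it NaN.
df['Sal_Group'] = pd.cut(df['Sal'], bins, labels=group_names, include_lowest=True)
print(df)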
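
To check the approach without a CSV file, here is a minimal self-contained sketch that builds the sample rows from the question inline. The column name Sal_Group and the label "mid" follow the expected output in the question. The equal-width bins are the assumption this answer makes; the question does not say where the group boundaries should fall.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question, built inline instead of read from a file.
df = pd.DataFrame({
    "Empno": [1, 2, 3, 4, 5, 6, 7],
    "Sal": [800, 1600, 2975, 1250, 2850, 2450, 3000],
    "Deptno": [20, 30, 20, 30, 30, 10, 20],
})

# Four evenly spaced edges from the minimum (800) to the maximum (3000)
# give three equal-width salary ranges.
bins = np.linspace(df["Sal"].min(), df["Sal"].max(), 4)

# Assumption: equal-width ranges are what "low/mid/high" should mean here.
df["Sal_Group"] = pd.cut(
    df["Sal"], bins, labels=["low", "mid", "high"], include_lowest=True
)

print(bins)
print(df)
```

With this data the edges land at roughly 800, 1533, 2267 and 3000, so employees 1 and 4 fall in low, employee 2 in mid, and the rest in high. If you would rather have groups with about the same number of employees in each, pd.qcut splits by quantiles instead of by equal width.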