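I have a dictionary and a DataFrame with two columns: name and salary. I want to sum the salaries whose names match the values in the dictionary. This is what I have so far. I want to add up the salaries of the managers, clerks, and analysts separately.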
import pandas as pd
a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales', 'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
b = [45000, 78000, 12000, 45000, 96000, 78000, 56000, 95000, 84000, 75000, 95000, 26000]
df = pd.DataFrame({'name': a, 'salary': b})
sum = 0
k = 0
c = []
# collect the distinct names
for i in a:
    if i not in c:
        c.append(i)
for j in range(len(df)):
    while k < len(c):
        p = c[k]
        print(p)
        # total salary for the current name
        d = df[df['name'] == p]['salary'].sum()
        k += 1
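You can use the groupby function to get separate totals for managers, clerks, analysts, and so on. It groups the rows by name and sums the salaries within each group, so no explicit loop is needed.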
import pandas as pd
a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales', 'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
b = [45000, 78000, 12000, 45000, 96000, 78000, 56000, 95000, 84000, 75000, 95000, 26000]
df = pd.DataFrame({'name': a, 'salary': b})
# one total per distinct name, returned as a Series indexed by name
df.groupby("name")["salary"].sum()
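If you only need the totals for some of the roles (the question mentions managers, clerks, and analysts), one possible approach is to index the grouped result by label. This is only a sketch: the variable names `totals`, `wanted`, and `selected_totals` are mine, and it assumes the role is spelled `analyst` in the data.

```python
import pandas as pd

a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales',
     'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
b = [45000, 78000, 12000, 45000, 96000, 78000,
     56000, 95000, 84000, 75000, 95000, 26000]
df = pd.DataFrame({'name': a, 'salary': b})

# Series of salary totals, indexed by role name
totals = df.groupby('name')['salary'].sum()

# Hypothetical list of the roles we care about
wanted = ['manager', 'clerk', 'analyst']

# Pick those roles out of the Series and convert to plain Python ints
selected_totals = {role: int(totals.loc[role]) for role in wanted}
print(selected_totals)
```

Note that `totals.loc[role]` raises a `KeyError` if a role is missing from the data, so you may want to check `role in totals.index` first.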
I guess you want output like this:
>>> import pandas as pd
>>> a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales', 'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
>>> b = [45000, 78000, 12000, 45000, 96000, 78000, 56000, 95000, 84000, 75000, 95000, 26000]
>>> df = pd.DataFrame({'name': a, 'salary': b})
>>> df.groupby('name').sum().reset_index()
      name  salary
0  analyst  217000
1    clerk  182000
2  manager  146000
3    sales  240000
If so, you can do this easily with groupby, which lets you split your data points into groups and then run whatever aggregation you want on each subset.
Documentation:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.sum.html
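To illustrate "whatever aggregation you want on each subset", here is a small sketch that computes several aggregations per group in one call using `agg`. The choice of `sum`, `mean`, and `count` and the name `summary` are just for illustration, not something the question asked for.

```python
import pandas as pd

a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales',
     'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
b = [45000, 78000, 12000, 45000, 96000, 78000,
     56000, 95000, 84000, 75000, 95000, 26000]
df = pd.DataFrame({'name': a, 'salary': b})

# One row per role, one column per aggregation
summary = df.groupby('name')['salary'].agg(['sum', 'mean', 'count'])
print(summary)
```

The result is a DataFrame indexed by name, so `summary.loc['manager', 'sum']` gives the manager total.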
Here is a code snippet showing the result.
>>> import pandas as pd
>>> a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales', 'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
>>> b = [45000, 78000, 12000, 45000, 96000, 78000, 56000, 95000, 84000, 75000, 95000, 26000]
>>> df = pd.DataFrame({'name': a, 'salary': b})
>>> df
       name  salary
0   manager   45000
1     sales   78000
2     clerk   12000
3   manager   45000
4   analyst   96000
5     sales   78000
6   manager   56000
7   analyst   95000
8     sales   84000
9     clerk   75000
10    clerk   95000
11  analyst   26000
>>> df.groupby(['name']).sum()
         salary
name
analyst  217000
clerk    182000
manager  146000
sales    240000
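The question also mentions a dictionary whose values should match the names, but the dictionary itself was not shown. As a rough sketch, assuming it looks something like the hypothetical `roles` below, you could filter the rows with `isin` before grouping:

```python
import pandas as pd

a = ['manager', 'sales', 'clerk', 'manager', 'analyst', 'sales',
     'manager', 'analyst', 'sales', 'clerk', 'clerk', 'analyst']
b = [45000, 78000, 12000, 45000, 96000, 78000,
     56000, 95000, 84000, 75000, 95000, 26000]
df = pd.DataFrame({'name': a, 'salary': b})

# Hypothetical dictionary; the real one was not included in the question
roles = {'m': 'manager', 'c': 'clerk', 'a': 'analyst'}

# Keep only the rows whose name appears among the dictionary's values
mask = df['name'].isin(list(roles.values()))

# Per-role totals for the matching rows only
matched_totals = df[mask].groupby('name')['salary'].sum()
print(matched_totals)

# Single grand total across all matching rows
grand_total = int(df.loc[mask, 'salary'].sum())
print(grand_total)
```

With this data, `sales` is dropped because it is not among the dictionary's values.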