Compare the values in two columns and extract the values of a third column in a DataFrame




df =

Location  teams  goals
1         A      5
1         B      6
2         A      7
2         B      5
2         C      6
3         B      7

Example code

import pandas as pd

data = {'Location': {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
        'teams': {0: 'A', 1: 'B', 2: 'A', 3: 'B', 4: 'C', 5: 'B'},
        'goals': {0: 5, 1: 6, 2: 7, 3: 5, 4: 6, 5: 7}}
df = pd.DataFrame(data)
First

Aggregate with groupby:

# pass 'sum' as a string rather than the builtin sum
(df.groupby(['Location', 'teams'])['goals'].agg(['count', 'sum'])
 .unstack().swaplevel(0, 1, axis=1).sort_index(axis=1))

Output:

teams   A               B               C
count   sum     count   sum     count   sum
Location                        
1       1.0     5.0     1.0     6.0     NaN     NaN
2       1.0     7.0     1.0     5.0     1.0     6.0
3       NaN     NaN     1.0     7.0     NaN     NaN
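The counts and sums above print as floats because unstack inserts NaN for every Location/team pair that does not occur, which upcasts the columns. As a rough sketch, and assuming a missing pair may be treated as zero (an assumption not made in the answer, since a team with 0 goals and a team that is absent then look the same), passing fill_value=0 to unstack keeps the integers. The name wide is only illustrative.

```python
import pandas as pd

data = {'Location': {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
        'teams': {0: 'A', 1: 'B', 2: 'A', 3: 'B', 4: 'C', 5: 'B'},
        'goals': {0: 5, 1: 6, 2: 7, 3: 5, 4: 6, 5: 7}}
df = pd.DataFrame(data)

# Same aggregation as above, but missing Location/team pairs are
# filled with 0 instead of NaN, so the columns stay integer-typed.
wide = (df.groupby(['Location', 'teams'])['goals'].agg(['count', 'sum'])
        .unstack(fill_value=0)
        .swaplevel(0, 1, axis=1)
        .sort_index(axis=1))
print(wide)
```

If the distinction between "no goals" and "team not present" matters, the NaN version above is the safer one to keep.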



Second

Create idx to rename the columns:

idx = pd.MultiIndex.from_product([df['teams'].unique(), ['Team', 'Team Goal']]).map(lambda x: ' '.join(x))

idx

Index(['A Team', 'A Team Goal', 'B Team', 'B Team Goal', 'C Team', 'C Team Goal'], dtype='object')
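One thing to watch: df['teams'].unique() returns the teams in order of first appearance, while sort_index(axis=1) orders the aggregated columns alphabetically. The two happen to agree for this data. A minimal sketch that sorts the names explicitly, so the labels line up even when the teams appear in a different order (idx_sorted and team_names are illustrative names, not from the answer):

```python
import pandas as pd

data = {'Location': {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
        'teams': {0: 'A', 1: 'B', 2: 'A', 3: 'B', 4: 'C', 5: 'B'},
        'goals': {0: 5, 1: 6, 2: 7, 3: 5, 4: 6, 5: 7}}
df = pd.DataFrame(data)

# Sort the team names so they follow the same alphabetical order
# that sort_index(axis=1) gives the aggregated columns.
team_names = sorted(df['teams'].unique())

# 'count' sorts before 'sum', so each team contributes
# '<team> Team' (count) followed by '<team> Team Goal' (sum).
idx_sorted = pd.Index([f'{team} {suffix}'
                       for team in team_names
                       for suffix in ('Team', 'Team Goal')])
print(idx_sorted.tolist())
```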



Last

Rename the columns and call reset_index (this includes the code from the first step):

# pass 'sum' as a string rather than the builtin sum
(df.groupby(['Location', 'teams'])['goals'].agg(['count', 'sum'])
 .unstack().swaplevel(0, 1, axis=1).sort_index(axis=1)
 .set_axis(idx, axis=1).reset_index())

Location    A Team  A Team Goal B Team  B Team Goal C Team  C Team Goal
0   1           1.0     5.0         1.0     6.0         NaN     NaN
1   2           1.0     7.0         1.0     5.0         1.0     6.0
2   3           NaN     NaN         1.0     7.0         NaN     NaN

Use pivot to reshape the DataFrame and get goals. Check for non-null values in goals to identify teams, then join the two to get the result.

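# pivot is called with keyword arguments; pandas 2.0 no longer accepts them positionally
goals = df.pivot(index='Location', columns='teams', values='goals')
# 1 where the team has a goals entry for that Location, 0 otherwise
teams = goals.notna().astype(int)
teams.add_suffix(' Team').join(goals.add_suffix(' Team Goals'))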

Result

teams     A Team  B Team  C Team  A Team Goals  B Team Goals  C Team Goals
Location                                                                  
1              1       1       0           5.0           6.0           NaN
2              1       1       1           7.0           5.0           6.0
3              0       1       0           NaN           7.0           NaN
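The join above puts all the team flags first and all the goals columns after them. If each team's flag should sit next to its goals, as in the groupby answer, the columns can be reordered afterwards. A minimal sketch under that assumption (ordered and result are illustrative names):

```python
import pandas as pd

data = {'Location': {0: 1, 1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
        'teams': {0: 'A', 1: 'B', 2: 'A', 3: 'B', 4: 'C', 5: 'B'},
        'goals': {0: 5, 1: 6, 2: 7, 3: 5, 4: 6, 5: 7}}
df = pd.DataFrame(data)

# One row per Location, one column per team, goals as values.
goals = df.pivot(index='Location', columns='teams', values='goals')
# 1 where a team has a goals entry for that Location, 0 otherwise.
teams = goals.notna().astype(int)
result = teams.add_suffix(' Team').join(goals.add_suffix(' Team Goals'))

# Interleave the labels: 'A Team', 'A Team Goals', 'B Team', ...
ordered = [f'{team} {suffix}'
           for team in goals.columns
           for suffix in ('Team', 'Team Goals')]
result = result[ordered]
print(result)
```

Unlike the groupby version, the team columns here are 0/1 flags rather than row counts. The two agree for this data only because each Location/team pair appears at most once, and pivot would raise an error on duplicate pairs.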
