Pandas: group by and unstack data, then plot it as a horizontal stacked bar chart

I have a dataset like this:

Category    Date    Score_1     Score_2     Level   
A       1/1/2020    130         145     Excellent
A       1/5/2020    145         148     Excellent
C       1/2/2020    107         109     Need-Improvement
B       1/1/2020    125         128     Good
C       1/7/2020    105         107     Need-Improvement
B       1/2/2020    127         117     Good
A       1/12/2020   117         126     Good
C       1/12/2020   123         124     Good

Dataset-2

Category    Mean    Excellent(%)  Good(%)   Need-Improvement(%)
A          130.6    66.67         33.33     0
B          126      0             100       0
C          111.6    0             66.67     33.33

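I want to create Dataset-2 from Dataset-1 by taking the mean of the Score_1 values per Category, turning the Level values into columns, and computing the percentage of each Level within each Category. For this I wrote: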

Df_90=pd.DataFrame()
Df_90["Mean"]=df.groupby('Category')["Score_1"].mean()
Df_90=Df_90.reset_index()

This only achieves the first part (the mean), without the unstacking. So I tried the following:

df.groupby('Category')["Score_1"].mean().unstack('Level').head()

This raised the error KeyError: "Requested level (Level) does not match index name (Category)"

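I then want to plot the result as a horizontal bar chart with one bar per Category, stacked by the percentage of each Level. The mean determines the length of the bar, and the stacked segments show the Level percentages within the bar.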

You can aggregate the mean and the Level frequencies separately, then combine them:

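import pandas as pd

# Mean of Score_1 per Category (named aggregation gives the column name "Mean")
mean_score = df.groupby('Category').agg(Mean=('Score_1', 'mean'))

# Share of each Level within a Category, as a percentage, one column per Level
level_freq = (
    df.groupby(['Category'])
      ['Level'].value_counts(normalize=True)
      .mul(100)
      .unstack(fill_value=0)          # Levels become columns; missing combinations become 0
      .rename_axis(columns=None)      # drop the "Level" label on the columns axis
      .add_suffix('(%)')
)

# Both frames are indexed by Category, so they align side by side
result = pd.concat([mean_score, level_freq], axis=1)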
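The snippet above stops at the table, so here is a minimal sketch of the plotting step the question asks for. It rebuilds the sample rows from the question, repeats the aggregation, and then scales each percentage by the Category mean so that the segments of a bar add up to that mean. The `segments` name, the axis labels, and the output filename `category_levels.png` are my own choices for illustration, not something from the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows copied from the question (only the columns used here)
df = pd.DataFrame({
    "Category": ["A", "A", "C", "B", "C", "B", "A", "C"],
    "Score_1": [130, 145, 107, 125, 105, 127, 117, 123],
    "Level": ["Excellent", "Excellent", "Need-Improvement", "Good",
              "Need-Improvement", "Good", "Good", "Good"],
})

# Same aggregation as above: mean per Category plus Level percentages
mean_score = df.groupby("Category").agg(Mean=("Score_1", "mean"))
level_freq = (
    df.groupby("Category")["Level"]
      .value_counts(normalize=True)
      .mul(100)
      .unstack(fill_value=0)
      .rename_axis(columns=None)
      .add_suffix("(%)")
)
result = pd.concat([mean_score, level_freq], axis=1)

# Convert percentages into bar-segment lengths: each row now sums to its Mean
segments = level_freq.div(100).mul(result["Mean"], axis=0)

# One horizontal bar per Category, stacked by Level
ax = segments.plot.barh(stacked=True)
ax.set_xlabel("Mean Score_1")   # label text is an assumption
ax.set_ylabel("Category")
plt.tight_layout()
ax.figure.savefig("category_levels.png")  # hypothetical filename; use plt.show() interactively
```

Scaling by the mean is one way to read "the mean determines the length of the bar"; if you would rather have every bar span 0 to 100, plot `level_freq` directly instead of `segments`. Note that with the sample rows as posted, Category C has two Need-Improvement rows and one Good row, so its computed percentages come out the other way round from the expected table in the question.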
