这里有一个我的工作代码示例,它实现了我想要的,但太长了。得到的数据帧是我想要的格式:
cc_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'Low'))]
cc_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'Medium'))]
cc_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Critical Care') & (df_encounters['Enloe Threshold'] == 'High'))]
ms_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'Low'))]
ms_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'Medium'))]
ms_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Medical Surgical') & (df_encounters['Enloe Threshold'] == 'High'))]
sc_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'Low'))]
sc_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'Medium'))]
sc_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'Specialty Care') & (df_encounters['Enloe Threshold'] == 'High'))]
ns_idx_low = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'Low'))]
ns_idx_med = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'Medium'))]
ns_idx_high = df_encounters.loc[((df_encounters['Specialty Type'] == 'No Specialty') & (df_encounters['Enloe Threshold'] == 'High'))]
cc_low_val = cc_idx_low['Encounter CSN'].nunique()
cc_med_val = cc_idx_med['Encounter CSN'].nunique()
cc_high_val = cc_idx_high['Encounter CSN'].nunique()
ms_low_val = ms_idx_low['Encounter CSN'].nunique()
ms_med_val = ms_idx_med['Encounter CSN'].nunique()
ms_high_val = ms_idx_high['Encounter CSN'].nunique()
sc_low_val = sc_idx_low['Encounter CSN'].nunique()
sc_med_val = sc_idx_med['Encounter CSN'].nunique()
sc_high_val = sc_idx_high['Encounter CSN'].nunique()
ns_low_val = ns_idx_low['Encounter CSN'].nunique()
ns_med_val = ns_idx_med['Encounter CSN'].nunique()
ns_high_val = ns_idx_high['Encounter CSN'].nunique()
# building the specialty range list using the values from above
df_encounter_counts=[['Critical Care',cc_low_val,cc_med_val,cc_high_val],
['Medical Surgical',ms_low_val,ms_med_val,ms_high_val],
['Specialty Care',sc_low_val,sc_med_val,sc_high_val],
['No Specialty',ns_low_val,ns_med_val,ns_high_val]]
# creating the ranges data frame for the list of values
df_encounter_counts = pd.DataFrame(df_encounter_counts,columns=['Specialty','Low','Medium','High'])
df_encounter_counts
我知道,在"专长类型"one_answers"基础阈值"上使用groupby((,在"遭遇CSN"上使用.nunique(((每个专长类型和阈值的唯一遭遇数:
示例数据:值的三列
df_encounters = df[['Encounter CSN','Specialty Type','Foundation Threshold']].copy()
df_enc_totals = df_encounters.groupby(['Specialty Type','Foundation Threshold'])['Encounter CSN'].nunique()
Specialty Type Foundation Threshold
Critical Care High 271
Low 165
Medium 439
Medical Surgical High 245
Low 2307
Medium 2724
No Specialty High 10
Low 1277
Medium 472
Specialty Care High 214
Low 906
Medium 1205
Name: Encounter CSN, dtype: int64
我想把";"低"中等";,以及";高";在他们自己的专栏下,下面有独特遭遇的计数,比如:
Specialty Type Low Medium High
Critical Care 165 439 271
Medical Surgical 2307 2724 245
Specialty Care 906 1205 214
No Specialty 1277 472 10
编辑:未堆叠的";"基础阈值";尝试格式化时出错
最终答案(标记答案和我的答案的组合(:
df_encounters = df[['Encounter CSN','Specialty Type','Enloe Threshold']].copy()
df_enc_totals = df_encounters.groupby(['Specialty Type','Enloe Threshold'])['Encounter CSN'].nunique()
df_enc_count = df_enc_totals.unstack('Enloe Threshold')
df_enc_count['Specialty'] = df_enc_count.index
df_enc_count.columns.name = ''
df_enc_count.index = range(0, 4, 1)
df_enc_count[['Specialty','Low','Medium','High']]```
这是您的数据帧:
df_enc_totals = pd.DataFrame({'': {('Critical Care', 'High'): 271,
('Critical Care', 'Low'): 165,
('Critical Care', 'Medium'): 439,
('Medical Surgical', 'High'): 245,
('Medical Surgical', 'Low'): 2307,
('Medical Surgical', 'Medium'): 2724,
('No Specialty', 'High'): 10,
('No Specialty', 'Low'): 1277,
('No Specialty', 'Medium'): 472,
('Specialty Care', 'High'): 214,
('Specialty Care', 'Low'): 906,
('Specialty Care', 'Medium'): 1205}}
)
df_enc_totals.index.names = ['Specialty Type', 'Foundation Threshold']
df_enc_totals
请尝试这个:
df_enc_totals = df_enc_totals.unstack('Foundation Threshold')
df_enc_totals.columns = df_enc_totals.columns.droplevel()
df_enc_totals = df_enc_totals[['Low', 'Medium','High']]
df_enc_totals.columns.name = ''
df_enc_totals.reset_index(inplace = True)
df_enc_totals