Assign priority based on column values, then select rows



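I want to assign priorities across multiple columns and then select rows according to those priorities.

For each ID, I want to pick the row where the Category column prefers RC, then apply a priority on the Status column as well, and select the rows accordingly.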

Example input DataFrame:

ID  Category  Status    Date
1   GC        Pending   01-03-2015
1   RC        Resolved  05-10-2016
1   GC        Resolved  06-03-2017
2   RC        Pending   09-08-2016
2   RC        Resolved  10-05-2014
3   GC        Resolved  10-08-2018
3   RC        Pending   13-05-2019
4   GC        Pending   10-06-2018
4   GC        Resolved  15-09-2014

Expected output DataFrame:

ID  Category  Status    Date
1   RC        Resolved  05-10-2016
2   RC        Pending   09-08-2016
3   RC        Pending   13-05-2019
4   GC        Pending   10-06-2018

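Convert both columns to ordered categoricals, passing a list to the categories parameter to set the priority (lowest priority first, highest last). Then sort by the three columns with DataFrame.sort_values, and finally remove duplicates with DataFrame.drop_duplicates, keeping the last row for each ID:
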
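import pandas as pd

# Categories are listed from lowest to highest priority
df['Category'] = pd.Categorical(df['Category'], ordered=True, categories=['GC','RC'])
df['Status'] = pd.Categorical(df['Status'], ordered=True, categories=['Resolved','Pending'])
# Sort ascending, then keep the last (highest-priority) row for each ID
df = df.sort_values(['ID','Category','Status']).drop_duplicates('ID', keep='last')
print(df)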
   ID Category    Status        Date
1   1       RC  Resolved  05-10-2016
3   2       RC   Pending  09-08-2016
6   3       RC   Pending  13-05-2019
7   4       GC   Pending  10-06-2018
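
If you want to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same ordered-categorical approach. The helper name pick_by_priority is hypothetical (it is not part of pandas), and working on a temporary copy is just one possible design, chosen here so that the Category and Status columns of the original frame are not converted in place.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 3, 3, 4, 4],
    "Category": ["GC", "RC", "GC", "RC", "RC", "GC", "RC", "GC", "GC"],
    "Status": ["Pending", "Resolved", "Resolved", "Pending", "Resolved",
               "Resolved", "Pending", "Pending", "Resolved"],
    "Date": ["01-03-2015", "05-10-2016", "06-03-2017", "09-08-2016",
             "10-05-2014", "10-08-2018", "13-05-2019", "10-06-2018",
             "15-09-2014"],
})


def pick_by_priority(frame):
    """Hypothetical helper: return the highest-priority row per ID."""
    # Categories are listed from lowest to highest priority.
    ranked = frame.assign(
        Category=pd.Categorical(frame["Category"],
                                categories=["GC", "RC"], ordered=True),
        Status=pd.Categorical(frame["Status"],
                              categories=["Resolved", "Pending"], ordered=True),
    )
    # After an ascending sort, the best row of each ID comes last.
    keep = (ranked.sort_values(["ID", "Category", "Status"])
                  .drop_duplicates("ID", keep="last")
                  .index)
    # Select from the original frame so its columns keep their dtypes.
    return frame.loc[keep]


result = pick_by_priority(df)
print(result)
```

Any value that is not listed in categories becomes NaN in the categorical column, so extend the two lists if your real data has more categories or statuses.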
