Pandas数据帧中的状态转换计数



我正在处理一个熊猫数据帧,如下所示:

Items    Run    Status
T1       R1      OK
T2       R1      OK
T3       R1      OK
T4       R1      OK
T1       R2      NOK
T2       R2      OK
T3       R2      NOK
T4       R2      OK
T1       R3      OK
T2       R3      NOK
T3       R3      OK
T4       R3      OK

我想捕捉每个项目发生的状态转换(OK到NOK到OK,反之亦然(,以及状态转换发生的次数,如下所示:

Items    Run    Status   Transition   Count of Transition per Items
T1       R1      OK      0             1
T1       R2      NOK     1 
T1       R3      OK      0
T2       R1      OK      0             1
T2       R2      OK      0
T2       R3      NOK     1
T3       R1      OK      0             2
T3       R2      NOK     1
T3       R3      OK      1
T4       R1      OK      0             0
T4       R2      OK      0
T4       R3      OK      0

IIUC,您可以使用:

# sort values by Items and Run
df = df.sort_values(by=['Items', 'Run'])
# identify changes in Status
m = df['Status'].ne(df.groupby('Items')['Status'].shift().bfill())
# convert boolean to integer
df['Transition'] = m.astype(int)
# count per group masking the non-first values with empty string
df['Count'] = (m.groupby(df['Items']).transform('sum')
.mask(df['Items'].duplicated(), '') # optional
)

输出:

Items Run Status  Transition Count
0     T1  R1     OK           0     2
4     T1  R2    NOK           1      
8     T1  R3     OK           1      
1     T2  R1     OK           0     1
5     T2  R2     OK           0      
9     T2  R3    NOK           1      
2     T3  R1     OK           0     2
6     T3  R2    NOK           1      
10    T3  R3     OK           1      
3     T4  R1     OK           0     0
7     T4  R2     OK           0      
11    T4  R3     OK           0      

最新更新