Python dataset: reading a set of columns and placing them in a separate DataFrame?



Can anyone help me? I am new to Python. I have a dataset named "Purchase data" that contains the log data for every PO activity of each CaseID.

Case Id     Activity                                 transactionstatus                                   
1           Create Purchase Requisition                     Closed
1           Create Request for Quotation Requester          Closed
1           Analyze Request for Quotation                   Closed
1           Send Request for Quotation to Supplier          Closed
1           Create Quotation comparison Map                 Closed
1           Analyze Quotation comparison Map                Closed
1           Choose best option                              Closed
1           Settle conditions with supplier                 Closed
1           Create Purchase Order                           Closed
1           Confirm Purchase Order                          Closed
1           Deliver Goods Services                          Closed
1           Release Purchase Order                          Closed
1           Approve Purchase Order for payment              Closed
1           Send invoice                                    Closed
1           Release Supplier's Invoice                      Closed
1           Authorize Supplier's Invoice payment            Closed
1           Pay invoice                                     Closed

Here, each Case ID is treated as one variable, so there are 1949 such variables in total.

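For example, Case ID 1 is treated as one variable covering the Activity column from "Create Purchase Requisition" to "Pay invoice", and its transaction status is considered "Closed" (see the data above).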

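Many Case IDs have a transaction status of "Open". What I am trying to do is get all of those Case IDs together with their corresponding activities (the whole data from the activity "Create..." up to "Analyze....", i.e. the activity whose transaction status is "Open") and place them in a separate dataset.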

For example:

Case ID Activity                                         TransactionStatus
1941    Create Purchase Requisition                            Closed
1941    Analyze Purchase Requisition                           Closed
1941    Create Request for Quotation Requester Manager         Closed
1941    Analyze Request for Quotation                           Open
1949    Create Purchase Requisition                            Closed
1949    Analyze Purchase Requisition                            Open

In total there are 196 records with an Open transaction status. Can anyone help me with how to do this?

Try this:

df = df[df['caseID'].isin(df.loc[df['TransactionStatus'] == 'Open', 'caseID'])]
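To see what that one-liner does, here is a minimal sketch on a handful of rows copied from the question. It assumes the columns are literally named `caseID` and `TransactionStatus`, as in the line above; the tables in the question show headers such as "Case Id" and "transactionstatus", so the names may need adjusting to match the real data. The variable names `open_ids` and `open_cases` are made up for illustration, and the result is stored in a new DataFrame rather than overwriting `df`.

```python
import pandas as pd

# Sample rows copied from the question. The column names "caseID" and
# "TransactionStatus" are assumptions taken from the one-liner above.
df = pd.DataFrame({
    "caseID": [1, 1, 1941, 1941, 1941, 1941, 1949, 1949],
    "Activity": [
        "Create Purchase Requisition",
        "Pay invoice",
        "Create Purchase Requisition",
        "Analyze Purchase Requisition",
        "Create Request for Quotation Requester Manager",
        "Analyze Request for Quotation",
        "Create Purchase Requisition",
        "Analyze Purchase Requisition",
    ],
    "TransactionStatus": [
        "Closed", "Closed",
        "Closed", "Closed", "Closed", "Open",
        "Closed", "Open",
    ],
})

# Case IDs that have at least one row with status "Open".
open_ids = df.loc[df["TransactionStatus"] == "Open", "caseID"].unique()

# Every row (Closed and Open) belonging to those cases, kept in a
# separate DataFrame so the original df is left untouched.
open_cases = df[df["caseID"].isin(open_ids)].reset_index(drop=True)

print(open_cases)
print(len(open_ids), "case(s) with an Open activity")
```

On the real data, `len(open_ids)` could be compared against the 196 Open records mentioned in the question, assuming each open case has exactly one Open row.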
