Can anyone help me? I'm new to Python. I have a dataset called "Purchase Data" that contains the log data of every PO activity for each Case ID.
Case ID Activity TransactionStatus
1 Create Purchase Requisition Closed
1 Create Request for Quotation Requester Closed
1 Analyze Request for Quotation Closed
1 Send Request for Quotation to Supplier Closed
1 Create Quotation comparison Map Closed
1 Analyze Quotation comparison Map Closed
1 Choose best option Closed
1 Settle conditions with supplier Closed
1 Create Purchase Order Closed
1 Confirm Purchase Order Closed
1 Deliver Goods Services Closed
1 Release Purchase Order Closed
1 Approve Purchase Order for payment Closed
1 Send invoice Closed
1 Release Supplier's Invoice Closed
1 Authorize Supplier's Invoice payment Closed
1 Pay invoice Closed
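Here, each Case ID is treated as one variant, so there are 1949 such variants in total.
For example: Case ID 1 is treated as one variant that runs from the activity "Create Purchase Requisition" to "Pay invoice", and its transaction status is "Closed" (as in the data above).
Now, many case IDs have a transaction status of "Open". What I'm trying to do is get all of those case IDs with their corresponding activities (the entire data from the activity "Create..." up to "Analyze....", i.e. the row where the transaction status is "Open") and place them in a separate dataset.
For example: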
Case ID Activity TransactionStatus
1941 Create Purchase Requisition Closed
1941 Analyze Purchase Requisition Closed
1941 Create Request for Quotation Requester Manager Closed
1941 Analyze Request for Quotation Open
1949 Create Purchase Requisition Closed
1949 Analyze Purchase Requisition Open
In total there are 196 records with an open transaction status! Can anyone help me with how to do this?
Try this:
df = df[df['caseID'].isin(df.loc[df['TransactionStatus'] == 'Open', 'caseID'])]
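
To see what that one-liner does, here is a minimal sketch on a small made-up frame. The column names `caseID`, `Activity` and `TransactionStatus` are assumptions taken from the answer's code (the sample tables spell them slightly differently), the rows are an abbreviated stand-in for the real log, and `open_ids`, `open_cases` and `open_cases_alt` are hypothetical names. It writes the result to a new DataFrame instead of overwriting `df`, since the question asks for a separate dataset.

```python
import pandas as pd

# Hypothetical miniature of the purchase log (not the real data).
df = pd.DataFrame({
    "caseID": [1, 1, 1941, 1941, 1941, 1941, 1949, 1949],
    "Activity": [
        "Create Purchase Requisition",
        "Pay invoice",
        "Create Purchase Requisition",
        "Analyze Purchase Requisition",
        "Create Request for Quotation Requester Manager",
        "Analyze Request for Quotation",
        "Create Purchase Requisition",
        "Analyze Purchase Requisition",
    ],
    "TransactionStatus": [
        "Closed", "Closed",
        "Closed", "Closed", "Closed", "Open",
        "Closed", "Open",
    ],
})

# Step 1: collect the IDs of cases that have at least one "Open" row.
open_ids = df.loc[df["TransactionStatus"] == "Open", "caseID"].unique()

# Step 2: keep every row (Closed and Open) belonging to those cases,
# stored in a separate DataFrame so the original df stays intact.
open_cases = df[df["caseID"].isin(open_ids)].reset_index(drop=True)

# Equivalent alternative: flag each row by whether its case has any "Open" row.
has_open = df["TransactionStatus"].eq("Open").groupby(df["caseID"]).transform("any")
open_cases_alt = df[has_open].reset_index(drop=True)

print(open_cases)
```

On this toy input, `open_cases` holds the six rows of cases 1941 and 1949 and drops case 1 entirely. Note that this keeps all rows of a matching case; if a real case could have further rows after its "Open" row, those would be kept too.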