Counting events between consecutive stages of a process using R



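I have been working through a textbook exercise in which I have to count the distinct events that occur between consecutive stages of an industrial process.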

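Information about the process: test subjects go through a process with three stages, A, B and C, in that order. A subject can abandon the process at stage A or B and later restart from stage A. Each time a stage takes place, a dataset entry is created containing the subject's identifier, the timestamp at which the stage occurred, and a unique VISIT_CODE. At any stage, a subject may trigger an "ALERT", which is recorded together with a TIMESTAMP, an ALERT_CODE and the subject's identifier.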

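What to compute: I have to write R code that counts the number of alerts a subject generates between stages A and B, between stages B and C, and finally after stage C. Note that a subject may abandon the process at some point and later restart from stage A.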

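The textbook gives a hint: "Look carefully at which stage the subject is currently in, then determine whether the ALERT was generated from stage A and before stage B, but keep in mind that if a test subject gives up at stage A and triggers an ALERT, it still counts as between A and B if the TIMESTAMP of that ALERT is less than that of their next attempt at stage A."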
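One way to read this hint is "attach each alert to the most recent stage entry at or before the alert's timestamp". The following is only a minimal base R sketch of that idea for a single subject, using the XYU-1 values from the datasets in this question; the names `stages`, `alerts` and `idx` are illustrative and not part of the exercise.

```r
# Toy data for one subject (values taken from the XYU-1 rows of the question)
stages <- data.frame(
  STAGE     = c("A", "A", "B"),
  TIMESTAMP = c(10, 15, 20),
  stringsAsFactors = FALSE
)
alerts <- data.frame(
  TIMESTAMP  = c(11, 12, 28),
  ALERT_CODE = c("AYUJ-151571406", "AYUJ-487008829", "AYUJ-211990388"),
  stringsAsFactors = FALSE
)

# findInterval() needs the stage timestamps in increasing order
stages <- stages[order(stages$TIMESTAMP), ]

# For each alert, index of the last stage entry with TIMESTAMP <= alert time
# (0 means the alert happened before any stage entry)
idx <- findInterval(alerts$TIMESTAMP, stages$TIMESTAMP)

# Label each alert with that stage; alerts before the first stage get NA
alerts$STAGE <- ifelse(idx == 0, NA_character_, stages$STAGE[pmax(idx, 1)])

print(alerts)
```

Here the alerts at 11 and 12 fall inside the abandoned first attempt at stage A (before the second attempt at 15), so they are labelled A, while the alert at 28 comes after stage B at 20 and is labelled B.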

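As another hint, the textbook reveals that there is only one ALERT after stage C, and that it was triggered by test subject W-6 with ALERT_CODE AYUJ-3915716168. The datasets are:

Stages dataset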

library(tibble)

TableA <- tribble(~STAGE, ~TEST_SUBJECT, ~TIMESTAMP, ~VISIT_CODE,
"A",    "XYU-1",    "10",   "BKO",
"A",    "XYU-1",    "15",   "JUJD",
"B",    "XYU-1",    "20",   "DUDH",
"A",    "FF-09",    "25",   "KSIWJD",
"B",    "FF-09",    "30",   "AJAKAM",
"C",    "FF-09",    "35",   "ZISKS",
"A",    "UU-89",    "40",   "NNXJD",
"B",    "UU-89",    "45",   "DDUWO",
"A",    "I-44", "50",   "JIWIW",
"A",    "W-6",  "55",   "SHDN",
"B",    "W-6",  "60",   "IWOLS",
"C",    "W-6",  "65",   "JDDD",
"A",    "U-90", "70",   "DJDKSMS",
"B",    "U-90", "75",   "NDJSM",
"A",    "T-87", "80",   "DNDJDK")

Alerts dataset

TableB<-tribble(~TEST_SUBJECT,~TIMESTAMP,~ALERT_CODE,
"XYU-1",    "11",   "AYUJ-151571406",
"XYU-1",    "12",   "AYUJ-487008829",
"XYU-1",    "28",   "AYUJ-211990388",
"FF-09",    "32",   "AYUJ-4177221842",
"W-6",  "56",   "AYUJ-1300211351",
"W-6",  "63",   "AYUJ-3014305494",
"I-44", "67",   "AYUJ-4454800551",
"U-90", "73",   "AYUJ-1079921935",
"U-90", "76",   "AYUJ-3348911727",
"U-90", "79",   "AYUJ-2381219626",
"T-87", "82",   "AYUJ-4778326278",
"W-6",  "89",   "AYUJ-3915716168")

Solution:

The textbook states that the correct solution to this problem is:

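Stage A&B (including alerts from subjects who abandoned the process at their nth attempt at stage A): AYUJ-151571406, AYUJ-487008829, AYUJ-1300211351, AYUJ-1079921935, AYUJ-4778326278, AYUJ-4454800551

Stage B&C (including alerts from subjects who abandoned the process at their nth attempt at stage B): AYUJ-211990388

Alerts after stage C: AYUJ-3915716168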

Here is a data.table approach, which results in a list of alerts split by the stage (A, B or C) they follow.

library(data.table)
# Make tables data.table format
setDT(TableA)
setDT(TableB)
# Convert TIMESTAMP to numeric
TableA[, TIMESTAMP := as.numeric(TIMESTAMP)]
TableB[, TIMESTAMP := as.numeric(TIMESTAMP)]
# Create data.table with Stage intervals by test subject
DT.interval <- TableA[, .(start = min(TIMESTAMP)), by = .(TEST_SUBJECT, STAGE)]
# Perform rolling join
TableB[, Stage := DT.interval[TableB, 
STAGE, 
on = .(TEST_SUBJECT, start = TIMESTAMP), 
roll = Inf]][]
# Split alerts by stage
split(TableB[,3:4], by = "Stage")
# $A
#         ALERT_CODE Stage
# 1:  AYUJ-151571406     A
# 2:  AYUJ-487008829     A
# 3: AYUJ-1300211351     A
# 4: AYUJ-4454800551     A
# 5: AYUJ-1079921935     A
# 6: AYUJ-4778326278     A
# 
# $B
#         ALERT_CODE Stage
# 1:  AYUJ-211990388     B
# 2: AYUJ-4177221842     B
# 3: AYUJ-3014305494     B
# 4: AYUJ-3348911727     B
# 5: AYUJ-2381219626     B
# 
# $C
#         ALERT_CODE Stage
# 1: AYUJ-3915716168     C
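
The code above keeps only the earliest timestamp per subject and stage. That is enough for this dataset, but a subject who restarts at stage A after having already reached stage B would, as far as I can tell, have later alerts rolled onto the first B. As a hedged variation, the sketch below joins against every stage row instead and then counts the alerts per stage. The names `stages`, `alerts`, `start` and `counts` are illustrative, and the data are retyped from the question with numeric timestamps.

```r
library(data.table)

# Stage entries from the question (VISIT_CODE omitted, not needed here)
stages <- data.table(
  TEST_SUBJECT = c("XYU-1", "XYU-1", "XYU-1", "FF-09", "FF-09", "FF-09",
                   "UU-89", "UU-89", "I-44", "W-6", "W-6", "W-6",
                   "U-90", "U-90", "T-87"),
  STAGE        = c("A", "A", "B", "A", "B", "C", "A", "B", "A",
                   "A", "B", "C", "A", "B", "A"),
  TIMESTAMP    = c(10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80)
)

# Alerts from the question
alerts <- data.table(
  TEST_SUBJECT = c("XYU-1", "XYU-1", "XYU-1", "FF-09", "W-6", "W-6",
                   "I-44", "U-90", "U-90", "U-90", "T-87", "W-6"),
  TIMESTAMP    = c(11, 12, 28, 32, 56, 63, 67, 73, 76, 79, 82, 89),
  ALERT_CODE   = c("AYUJ-151571406", "AYUJ-487008829", "AYUJ-211990388",
                   "AYUJ-4177221842", "AYUJ-1300211351", "AYUJ-3014305494",
                   "AYUJ-4454800551", "AYUJ-1079921935", "AYUJ-3348911727",
                   "AYUJ-2381219626", "AYUJ-4778326278", "AYUJ-3915716168")
)

# Every stage row is an interval start, so a restart at A opens a new interval
stages[, start := TIMESTAMP]

# Rolling join: each alert takes the stage of the latest stage row of the
# same subject whose start is at or before the alert's timestamp
alerts[, Stage := stages[alerts, STAGE,
                         on = .(TEST_SUBJECT, start = TIMESTAMP),
                         roll = Inf]]

# Number of alerts attributed to each stage
counts <- alerts[, .N, by = Stage][order(Stage)]
print(counts)
```

On this dataset the per-alert labels agree with the split shown above: six alerts following stage A, five following stage B and one following stage C.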
