如何评估一列数字以查找重复序列中的更改



我有一个100000行的表,其中一列是一系列重复的数字,代表机器人生产线上的"点"。我需要检测重复序列何时发生变化。序列由偶数和两边的奇数组成。因此,给定序列中的数字要么相隔1,要么相隔2,例如32、31、33、31、31、33,32、31、32、31,33、32。这三个数字的顺序是随机的,除了一个数字不能跟随自己。序列可以改变,例如33,32,31,33,31,31,32,32,33,44,45,44,43,45,43,44,29,30,31,29,31,30,30。

我希望能够检测序列中的每个更改,并用1标记每个更改(在没有更改的行中保留默认的0)。

在不确定一个数字是属于一个序列还是另一个序列的情况下,例如32、31、33、33、31、30、29、30、31(该序列已从一个基于32的序列变为一个基于30的序列),假设可以属于任一序列的最后一个数字属于"引入"序列。

编号序列中的每个步骤都有一个唯一的moveID,按设施和robotNum进行分区。

我试着用超前和滞后来推导"点"前后的数字,找出点前三个数字之和和和点后三个数字的和之间的差值,看看差值是否超过了某些极限,从而表明序列发生了变化。(请参阅以下代码)。

SELECT moveID, facility, robotNum, spot, 
CASE WHEN (((lastSpot1 + lastSpot2 + lastSpot3) - (nextSpot1 +  nextSpot2 +  nextSpot3))/3 BETWEEN -1.3 AND 1.3 ) 
THEN 1
ELSE 0 
END 
AS sequenceChange
FROM
( SELECT facility, robotNum,
, LEAD(spot,1,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS nextSpot1
, LEAD(spot,2,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS nextSpot2
, LEAD(spot,3,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS nextSpot3
, LAG(spot,1,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS lastSpot1
, LAG(spot,2,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS lastSpot2
, LAG(spot,3,0) OVER(PARTITION BY facility, robotNum ORDER BY facility, robotNum, moveID ) AS lastSpot3
FROM SequenceTable1 
) t1
ORDER BY facility, robotNum, moveID.

然而,这种方法是不可靠的,因为一些不同序列的前3位数字之和重叠。需要某种方法来"放大"差异,以便更可靠地检测变化,并在每个新序列开始时在"sequenceChange"列中更新1。

问题中描述的数字序列之间有几个简单的关系。一个是序列中数字之间的差永远不会超过2。另一个是每个唯一序列中的偶数的模为2,有些序列的模为4。有了模,就可以很容易地找到奇数所属的偶数。我识别了所有可能的逻辑组合,涵盖了数字之间的差异,以检测序列的变化(有144个可能的组合由20个规则涵盖),并计算模数。以下是需要按顺序应用的规则的SQL事例语句。

CASE    
WHEN    Spot = 1 OR Spot = 2   THEN 2   
WHEN    Spot = 3 and (POWER(lastSpot1,4) + POWER(Spot,4) + POWER(nextSpot2,4)) <114   THEN 2    
WHEN    Spot = 3 and nextSpot1 =  5   THEN 4    
WHEN    Spot = 3 and nextSpot1 = 4    THEN 4    
WHEN    Spot%4=0   THEN 4   
WHEN    Spot%4!=0 and Spot%2=0   THEN 2 
WHEN    Spot%2=1 and lastSpot1%2=1 AND (Spot -2 =lastSpot1 OR Spot+2 = lastSpot1) AND ((Spot+lastSpot1)/2)%4=0   THEN 4 
WHEN    Spot%2=1 and lastSpot1%2=1 AND (Spot -2 = lastSpot1 OR Spot+2 = lastSpot1) AND ((Spot+lastSpot1)/2)%4!=0 and ((Spot+lastSpot1)/2)%2=0   THEN 2  
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 = lastSpot1 OR Spot+1 = lastSpot1) AND lastSpot1%4=0    THEN 4   
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 = lastSpot1 OR Spot+1 = lastSpot1) AND lastSpot1%4!=0 AND lastSpot1%2=0     THEN 2   
WHEN    Spot%2=1 AND lastSpot1%2=1 AND (Spot-2 > lastSpot1 OR Spot+2 <  lastSpot1) AND nextSpot1%2=1 AND (Spot -2 = nextSpot1 OR Spot+2 = nextSpot1) AND  ((Spot+nextSpot1)/2)%4=0   THEN 4 
WHEN    Spot%2=1 AND lastSpot1%2=1 AND (Spot-2 > lastSpot1 OR Spot+2 <  lastSpot1) AND nextSpot1%2=1 AND (Spot -2 = nextSpot1 OR Spot+2 = nextSpot1) AND  ((Spot+nextSpot1)/2)%4!=0 AND and ((Spot+nextSpot1)/2)%2=0   THEN 2   
WHEN    Spot%2=1 AND lastSpot1%2=1 AND (Spot-2 > lastSpot1 OR Spot+2 <  lastSpot1) AND nextSpot1%2=0 AND (Spot - 1 = nextSpot1 OR Spot+1 = nextSpot1) AND  nextSpot1%4=0   THEN 4   
WHEN    Spot%2=1 AND lastSpot1%2=1 AND (Spot-2 > lastSpot1 OR Spot+2 <  lastSpot1) AND nextSpot1%2=1 AND (Spot -2 = nextSpot1 OR Spot+2 = nextSpot1) AND   nextSpot1%4!=0 AND and nextSpot1%2=0   THEN 2    
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 > lastSpot1 OR Spot+1 <  lastSpot1) AND ((nextSpot1%2=1 AND (Spot-2 > nextSpot1 OR Spot+2 <  nextSpot1) ) OR(nextSpot1%2=0 AND (Spot-1 > nextSpot1 OR Spot+1 <  nextSpot1)))   THEN lastSpotStart    
WHEN    Spot%2=1 AND lastSpot1%2=1 AND (Spot-2 > lastSpot1 OR Spot+2 <  lastSpot1) AND ((nextSpot1%2=1 AND (Spot-2 > nextSpot1 OR Spot+2 <  nextSpot1) ) OR(nextSpot1%2=0 AND (Spot-1 > nextSpot1 OR Spot+1 <  nextSpot1)))   THEN lastSpotStart    
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-12 > lastSpot1 OR Spot+1 <  lastSpot1) AND nextSpot1%2=0 AND (Spot - 1 = nextSpot1 OR Spot+1 = nextSpot1) AND  nextSpot1%4=0   THEN 4  
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 > lastSpot1 OR Spot+1 <  lastSpot1) AND nextSpot1%2=0 AND (Spot - 1 = nextSpot1 OR Spot+1 = nextSpot1) AND   nextSpot1%4!=0 AND and nextSpot1%2=0   THEN 2   
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 > lastSpot1 OR Spot+1 < lastSpot1) AND nextSpot1%2=1 AND  (Spot-2 = nextSpot1 OR Spot+2 = nextSpot1)  AND ((Spot + nextSpot1)/2)%4=0    THEN 4   
WHEN    Spot%2=1 AND lastSpot1%2=0 AND (Spot-1 > lastSpot1 OR Spot+1 < lastSpot1) AND nextSpot1%2=1 AND  (Spot-2 = nextSpot1 OR Spot+2 = nextSpot1)  AND lastSpot1%4!=0 AND  ((Spot + nextSpot1)/2)%2=0   THEN 2    
ELSE 1      
END AS seqMod

接下来,为了计算每个序列中每个数字的索引(索引是序列的偶数),我使用了以下代码(感谢https://stackoverflow.com/users/15498/damien-the-unbeliever):

CASE 
WHEN seqMod = 4 AND (Spot = 1 OR Spot = 2) THEN 2  -- for when a robot that started at index 4 returns to index 2
ELSE 
CONVERT(int, ROUND((Spot +  CASE    WHEN seqMod = 2 THEN 2 ELSE 0 END )/4.0,0)* 4 - CASE    WHEN seqMod = 2 THEN 2 ELSE 0 END   )
END AS spotIndex

有了索引,很容易检测从一个序列到另一个的变化

最新更新