选择百分比返回给定两个时间段,但列值不同



我有一个查询来选择每月返回的用户的百分比,

 SELECT
(
    SELECT COUNT( DISTINCT t1.user )
    FROM mytable t1
    WHERE t1.month = 'November'
    AND EXISTS (
        SELECT *
        FROM mytable t2
        WHERE t2.user = t1.user
        AND t2.month = 'October'
    )
)
/
(
    SELECT COUNT( DISTINCT mytable.user )
    FROM mytable
    WHERE mytable.month = 'October'
) * 100

但是现在我想看看用户每个月返回什么,但去一个不同的事件,所以一个月参加一个活动,另一个月参加另一个活动,但不参加原来的活动的百分比。

伪代码应该是这样的:

SELECT  user IN event1 october AND users IN event2 november and users NOT IN event1 november /users IN colorado event1 * 100



   user month event 
   -----------------    
   jack october event2 
   jack november event3 
   jack november event3 
   jack november event2 
   jack december event3
   sam november event2 
   sam november event1
   sam october event1
   sam october event2   
   john october event1
   john november event2

通过执行INNER JOIN来获取用户在第二个月的活动,再加上LEFT JOIN在一个连接列中查找NULL以验证第一个月的事件不存在,您可以将其限制为叛逃者用户:

这个查询将只返回'john':

  SELECT
    event1.user
  FROM
    mytable event1
    -- Join to get the same users in the second month
    INNER JOIN mytable event2 ON event1.user = event2.user
    -- and a LEFT JOIN to find those who attend the month1 event in month2
    LEFT JOIN mytable event_excl 
       ON event2.user = event_excl.user 
       AND event2.month = event_excl.month
       AND event_excl.event = 'event1'
  WHERE
    event1.month = 'october' AND event1.event = 'event1'
    AND event2.month = 'november' AND event2.event = 'event2'
    -- Then include *only* those who don't match the event from the LEFT JOIN
    AND event_excl.event IS NULL

如果您将此查询包装在子选择中以计算您的百分比,它看起来像:

SELECT (
  SELECT
    COUNT(DISTINCT event1.user)
  FROM
    mytable event1
    INNER JOIN mytable event2 ON event1.user = event2.user
    LEFT JOIN mytable event_excl 
       ON event2.user = event_excl.user 
       AND event2.month = event_excl.month
       AND event_excl.event = 'event1'
  WHERE
    event1.month = 'october' AND event1.event = 'event1'
    AND event2.month = 'november' AND event2.event = 'event2'
    AND event_excl.event IS NULL
) / (
  SELECT COUNT(DISTINCT user)
  FROM mytable
  WHERE month = 'october'
  AND event = 'event1'
) * 100.0 AS defector_percent

…计算结果为50%,因为john, sam在10月和11月都存在,而john不存在,sam不存在。

在这里它是在行动: http://sqlfiddle.com/#!2/460ca/11

关于EXISTS 的注意:使用EXISTS / NOT EXISTS作为您的第一次发布尝试可以实现同样的事情,但我发现连接更容易映射关系。它也可能更快…或者,NOT IN(SELECT...)子查询可能具有与LEFT JOIN/IS NULL相似的性能。

最新更新