在Postgres中选择中间95%的分组结果集



i当前在测试API时具有记录RESTCALLS的表(下)。我需要做的是在指定的时间间隔内排除每个不同的RESTCALLNAME的极端情况(上/底部2.5%)。

我到目前为止最接近的是下面的代码返回我的表格,其中全部2.5%被排除在整个结果中。

     Column     |            Type             |       Modifiers
----------------+-----------------------------+------------------------
 timestamp      | timestamp without time zone | not null default now()
 testrunid      | character varying(255)      |
 sessionid      | character varying(255)      |
 restcallname   | character varying(255)      |
 completiontime | integer                     |

SELECT 
    restcallname, 
    count(restcallname) as noOfRestCalls, 
    round(avg(completiontime)) as avg_CompletionTime, 
    min(completiontime) as min_CompletionTime, 
    max(completiontime) as max_CompletionTime 
FROM (
    SELECT * 
    FROM requests
    WHERE 
        timestamp > NOW() - INTERVAL '1 week' 
    ORDER BY
        completiontime
    LIMIT (SELECT (COUNT(*) * 0.95)::integer FROM requests WHERE timestamp > NOW() - INTERVAL '1 week')
    OFFSET (SELECT (COUNT(*) * 0.025)::integer FROM requests WHERE timestamp > NOW() - INTERVAL '1 week')
) x
GROUP BY 
    restcallname 
ORDER BY 
    restcallname;

解决此问题或引用类似问题的任何建议?

我倾向于使用窗口函数进行此操作:

SELECT restcallname, 
       count(restcallname) as noOfRestCalls, 
       round(avg(completiontime)) as avg_CompletionTime, 
       min(completiontime) as min_CompletionTime, 
       max(completiontime) as max_CompletionTime 
FROM (SELECT r.*,
             ROW_NUMBER() OVER (ORDER BY completiontime) as seqnum,
             COUNT(*) OVER () as cnt
      FROM requests r
      WHERE  timestamp > NOW() - INTERVAL '1 week'
     ) r
WHERE seqnum >= 0.025 * cnt AND
      seqnum <= (1 - 0.025) * cnt
GROUP BY restcallname
ORDER BY restcallname;

最新更新