每周的记录总数



我有一个Postgres 9.1数据库。我正在尝试生成每周的记录数量(对于给定的日期范围),并将其与前一年进行比较。

我有以下代码用于生成该系列:

select generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series

但是,我不知道如何将计数的记录与生成的日期连接起来。

因此,以以下记录为例:

Pt_ID      exam_date
======     =========
1          2012-01-02
2          2012-01-02
3          2012-01-08
4          2012-01-08
1          2013-01-02
2          2013-01-02
3          2013-01-03
4          2013-01-04
1          2013-01-08
2          2013-01-10
3          2013-01-15
4          2013-01-24

我想让记录返回为:

  series        thisyr      lastyr
===========     =====       =====
2013-01-01        4           2
2013-01-08        3           2
2013-01-15        1           0
2013-01-22        1           0
2013-01-29        0           0

不确定如何引用子搜索中的日期范围。感谢您的帮助。

简单的方法是用@jpw演示的CROSS JOIN来解决这个问题。然而,存在一些隐藏的问题

  1. 无条件CROSS JOIN性能会随着行数的增长而快速恶化。在聚合中处理这个庞大的派生表之前,总行数乘以您正在测试的周数。索引也无济于事。

  2. 从1月1日开始的几周会导致不一致ISO周可能是一种选择。请参见下文。

以下所有查询都大量使用exam_date上的索引。一定要有一个。

仅联接到相关行

应该快得多

SELECT d.day, d.thisyr
     , count(t.exam_date) AS lastyr
FROM  (
   SELECT d.day::date, (d.day - '1 year'::interval)::date AS day0  -- for 2nd join
        , count(t.exam_date) AS thisyr
   FROM   generate_series('2013-01-01'::date
                        , '2013-01-31'::date  -- last week overlaps with Feb.
                        , '7 days'::interval) d(day)  -- returns timestamp
   LEFT   JOIN tbl t ON t.exam_date >= d.day::date
                    AND t.exam_date <  d.day::date + 7
   GROUP  BY d.day
   ) d
LEFT   JOIN tbl t ON t.exam_date >= d.day0      -- repeat with last year
                 AND t.exam_date <  d.day0 + 7
GROUP  BY d.day, d.thisyr
ORDER  BY d.day;

这是从1月1日开始的几周,就像你原来的一样。正如所评论的,这产生了一些不一致之处:每周从每年不同的一天开始,由于我们在年底中断,一年中的最后一周只有1或2天(闰年)。

与ISO周相同

根据要求,请考虑ISO周,该周从周一开始,通常为期7天。但它们跨越了岁月的边界。根据EXTRACT():上的文件

一年中一天所在的周数。根据定义(ISO 8601),周从星期一开始年份包含当年的1月4日。换句话说,第一一年中的星期四在该年的第一周。

在ISO定义中,1月初的日期可能是前一年第52周或第53周的一部分12月下旬是明年第一周的一部分。对于例如,2005-01-01是2004年第53周的一部分,并且2006-01-01是2005年第52周的一部分,而2012-12-31是2013年第一周的一部分。建议使用isoyear字段与CCD_ 8一起得到一致的结果。

以上查询用ISO周重写:

SELECT w AS isoweek
     , day::text  AS thisyr_monday, thisyr_ct
     , day0::text AS lastyr_monday, count(t.exam_date) AS lastyr_ct
FROM  (
   SELECT w, day
        , date_trunc('week', '2012-01-04'::date)::date + 7 * w AS day0
        , count(t.exam_date) AS thisyr_ct
   FROM  (
      SELECT w
           , date_trunc('week', '2013-01-04'::date)::date + 7 * w AS day
      FROM   generate_series(0, 4) w
      ) d
   LEFT   JOIN tbl t ON t.exam_date >= d.day
                    AND t.exam_date <  d.day + 7
   GROUP  BY d.w, d.day
   ) d
LEFT   JOIN tbl t ON t.exam_date >= d.day0     -- repeat with last year
                 AND t.exam_date <  d.day0 + 7
GROUP  BY d.w, d.day, d.day0, d.thisyr_ct
ORDER  BY d.w, d.day;

1月4日总是在一年中ISO的第一周。因此,这个表达式得到了给定年份的第一个ISO周的星期一:

date_trunc('week', '2012-01-04'::date)::date

使用EXTRACT()简化

由于ISO周数与EXTRACT()返回的周数一致,我们可以简化查询。首先,一个简短而简单的形式:

SELECT w AS isoweek
     , COALESCE(thisyr_ct, 0) AS thisyr_ct
     , COALESCE(lastyr_ct, 0) AS lastyr_ct
FROM   generate_series(1, 5) w
LEFT   JOIN (
   SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS thisyr_ct
   FROM   tbl
   WHERE  EXTRACT(isoyear FROM exam_date)::int = 2013
   GROUP  BY 1
   ) t13  USING (w)
LEFT   JOIN (
   SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS lastyr_ct
   FROM   tbl
   WHERE  EXTRACT(isoyear FROM exam_date)::int = 2012
   GROUP  BY 1
   ) t12  USING (w);

优化查询

更多细节相同,并针对性能进行了优化

WITH params AS (          -- enter parameters here, once 
   SELECT date_trunc('week', '2012-01-04'::date)::date AS last_start
        , date_trunc('week', '2013-01-04'::date)::date AS this_start
        , date_trunc('week', '2014-01-04'::date)::date AS next_start
        , 1 AS week_1
        , 5 AS week_n     -- show weeks 1 - 5
   )
SELECT w.w AS isoweek
     , p.this_start + 7 * (w - 1) AS thisyr_monday
     , COALESCE(t13.ct, 0) AS thisyr_ct
     , p.last_start + 7 * (w - 1) AS lastyr_monday
     , COALESCE(t12.ct, 0) AS lastyr_ct
FROM params p
   , generate_series(p.week_1, p.week_n) w(w)
LEFT   JOIN (
   SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
   FROM   tbl t, params p
   WHERE  t.exam_date >= p.this_start      -- only relevant dates
   AND    t.exam_date <  p.this_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND    t.exam_date <  p.next_start      -- don't cross over into next year
   GROUP  BY 1
   ) t13  USING (w)
LEFT   JOIN (                              -- same for last year
   SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
   FROM   tbl t, params p
   WHERE  t.exam_date >= p.last_start
   AND    t.exam_date <  p.last_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND    t.exam_date <  p.this_start
   GROUP  BY 1
   ) t12  USING (w);

在指数支持下,这应该非常快,并且可以很容易地适应所选择的区间。上次查询中generate_series()的隐式JOIN LATERAL需要Postgres 9.3

SQL Fiddle

使用cross join应该可以工作,我将在下面粘贴SQL Fiddle的markdown输出。您的2013-01-08系列的样本输出似乎不正确:thisyr应该是2,而不是3。虽然这可能不是最好的方法,但我的Postgresql知识还有很多不足之处。

SQL Fiddle

PostgreSQL 9.2.4架构设置:

CREATE TABLE Table1
    ("Pt_ID" varchar(6), "exam_date" date);
INSERT INTO Table1
    ("Pt_ID", "exam_date")
VALUES
    ('1', '2012-01-02'),('2', '2012-01-02'),
    ('3', '2012-01-08'),('4', '2012-01-08'),
    ('1', '2013-01-02'),('2', '2013-01-02'),
    ('3', '2013-01-03'),('4', '2013-01-04'),
    ('1', '2013-01-08'),('2', '2013-01-10'),
    ('3', '2013-01-15'),('4', '2013-01-24');

查询1

select 
  series, 
  sum (
    case 
      when exam_date 
        between series and series + '6 day'::interval
      then 1 
      else 0 
    end
  ) as thisyr,
  sum (
    case 
      when exam_date + '1 year'::interval 
        between series and series + '6 day'::interval
      then 1 else 0 
    end
  ) as lastyr
from table1
cross join generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series
group by series
order by series

结果

|                         SERIES | THISYR | LASTYR |
|--------------------------------|--------|--------|
| January, 01 2013 00:00:00+0000 |      4 |      2 |
| January, 08 2013 00:00:00+0000 |      2 |      2 |
| January, 15 2013 00:00:00+0000 |      1 |      0 |
| January, 22 2013 00:00:00+0000 |      1 |      0 |
| January, 29 2013 00:00:00+0000 |      0 |      0 |

相关内容

  • 没有找到相关文章

最新更新