How to create a function in PostgreSQL that loops over another function



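I am using PostgreSQL 9.3.9, and I have a procedure called list_all_upsells that takes the beginning and the end of a month. (See sqlfiddle.com/# ! for an example.) For instance, the code below lists the number of upsell accounts for October: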

select COUNT(up.*) as "Total Upsell Accounts in October" from 
list_all_upsells('2015-10-01 00:00:00'::timestamp, '2015-10-31 23:59:59'::timestamp) as up
where up.user_id not in
(select distinct user_id from paid_users_no_more 
where concat(extract(month from payment_stop_date),'-',extract(year from payment_stop_date))<>
concat(extract(month from payment_start_date),'-',extract(year from payment_start_date)));

The list_all_upsells procedure looks like this:

DECLARE
payor_email_2 text;
   BEGIN
FOR payor_email_2 in select distinct payor_email from paid_users LOOP
return query
execute
'select paid_users.* from paid_users,
(
select payment_start_date as first_time from paid_users
where payor_email = $3
order by payment_start_date limit 1
) as dummy
where payor_email = $3
and payment_start_date > first_time
and payment_start_date between $1 and $2
and first_time < $1'
using a, b, payor_email_2;
END LOOP;
return;
END

I want to be able to run this for every month we have records for, and get the data back in a single table, like this:

Month   | Total Upselled Accounts
---------------------------------
08/2014 | 23
09/2014 | 35
ETC...
10/2015 | 56

I have queries that return the first day and the last day of every month in which we have been in business. First day of each month:

select distinct date_trunc('month', payment_start_date)::date as startmonth
from paid_users ORDER BY startmonth;

Last day of each month:

SELECT distinct (date_trunc('MONTH', payment_start_date) + 
INTERVAL '1 MONTH - 1 day')::date as endmonth from paid_users 
ORDER BY endmonth;

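Now, how do I create a function that loops over list_all_upsells and gets the count for each month? For example, the first query (startmonth) gives me 2014-03-01, 2014-04-01, ... through 2015-10-01, while the second query (endmonth) gives me 2014-03-31, 2014-04-30, ... through 2015-10-31. I want to run list_all_upsells for each of those months so that I get a summary count of how many upsell accounts we had per month.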

My paid_users table looks like this:

CREATE TABLE paid_users
(
  user_id integer,
  user_email character varying(255),
  payor_id integer,
  payor_email character varying(255),
  payment_start_date timestamp without time zone DEFAULT now()
)

paid_users_no_more:

CREATE TABLE paid_users_no_more
(
  user_id integer,
  payment_stop_date timestamp without time zone DEFAULT now()
)

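Your function has a few issues, so let's start there. Its shortcomings are that (1) you need only one parameter to indicate the month, and passing both the start and the end of the month is setting yourself up for problems; (2) you do not need a dynamic query, because you are not varying any identifiers (table or column names); (3) you do not need a loop; and (4) your logic is wrong. I could also mention that in PostgreSQL these are functions, and that they all begin with a line like CREATE FUNCTION list_all_upsells(...), but that would be nit-picking.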

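Starting with the logic: apparently a user, identified by their email address, takes out a subscription from some payment_start_date until some payment_stop_date, and can do so multiple times. What you are looking for are the users who took out their first subscription before the month in question and who started a new subscription, not their first, during that month. In that case the filter payment_start_date > first_time is useless, because you already filter on a first subscription before the month in question (first_time < $1) and on a new subscription within it (payment_start_date BETWEEN $1 AND $2).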
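As a rough sketch of that logic for a single month, assuming the paid_users table from the question: the hard-coded month, the alias first_sub, and the use of min() to find the first subscription are illustrative choices, not part of the original function.

```sql
-- Repeat subscriptions started in October 2015 by payors whose
-- first subscription predates that month (illustrative month only).
SELECT pu.*
FROM paid_users pu
JOIN (
  -- Earliest subscription per payor.
  SELECT payor_email, min(payment_start_date) AS first_time
  FROM paid_users
  GROUP BY payor_email
) first_sub USING (payor_email)
WHERE first_sub.first_time < timestamp '2015-10-01'
  AND pu.payment_start_date >= timestamp '2015-10-01'
  AND pu.payment_start_date <  timestamp '2015-11-01';
```

The half-open range (>= start of month, < start of next month) avoids having to spell out an end-of-month timestamp such as 23:59:59.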

Points (1), (2) and (3) only become obvious once the query inside the function is rewritten:

CREATE FUNCTION list_all_upsells(timestamp) RETURNS SETOF paid_users AS $$
  SELECT paid_users.*
  FROM paid_users
  JOIN (  -- This JOIN keeps only those rows where the payor_email has a prior subscription
    SELECT DISTINCT payor_email,
           first_value(payment_start_date) OVER (PARTITION BY payor_email ORDER BY payment_start_date) AS dummy
    FROM paid_users
    WHERE payment_start_date < date_trunc('month', $1)
  ) dummy USING (payor_email)
  -- This filter keeps only those rows with new subscriptions in the month
  WHERE date_trunc('month', payment_start_date) = date_trunc('month', $1)
$$ LANGUAGE sql STRICT;

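Since the function body reduces to a single SQL statement, the function is now an sql language function, which is more efficient than plpgsql. You now supply just one parameter, which can be any moment in the month you want data for, so list_all_upsells(LOCALTIMESTAMP) gives you the results for the current month. In terms of the query you posted, it would be: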

SELECT count(up.*) AS "Total Upsell Accounts in October"
FROM list_all_upsells(LOCALTIMESTAMP) up
WHERE up.user_id NOT IN 
  (SELECT DISTINCT user_id FROM paid_users_no_more 
   WHERE date_trunc('month', payment_stop_date) <>
         date_trunc('month', up.payment_start_date)
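  );

Incidentally, this does raise the question of why you are using the table paid_users_no_more at all. Why not simply add a payment_stop_date column to the paid_users table? If that column is NULL, the user is still subscribed. The whole query is rather odd in any case, because list_all_upsells() returns new subscriptions in the month in question, so why bother with subscriptions that were cancelled at some other time?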
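A minimal sketch of that suggestion, assuming it suits your data: the column name payment_stop_date is borrowed from paid_users_no_more, and how existing stop dates would be migrated is left open, since paid_users_no_more is keyed by user_id only.

```sql
-- Hypothetical schema change: record the end of a subscription on the same row.
-- NULL means the subscription is still running.
ALTER TABLE paid_users
  ADD COLUMN payment_stop_date timestamp without time zone;

-- Subscriptions that are still active under this layout.
SELECT user_id, payor_email, payment_start_date
FROM paid_users
WHERE payment_stop_date IS NULL;
```

Note that the new column has no default here, unlike payment_stop_date in paid_users_no_more, so that existing rows start out as "still subscribed".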

Now to your real question:

SELECT months.m "Month", coalesce(count(up.*), 0) "Total Upselled Accounts"
FROM generate_series('2014-08-01'::timestamp,
                     date_trunc('month', LOCALTIMESTAMP),
                     '1 month') AS months(m)
LEFT JOIN list_all_upsells(months.m) AS up ON date_trunc('month', payment_start_date) = m
GROUP BY 1
ORDER BY 1;

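This generates a series of months from some starting month through the current month, and then counts the new subscriptions for each month, which may be 0.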
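If the month should be displayed as MM/YYYY, as in the table in the question, a variant along these lines might work. It is a sketch that assumes the one-argument list_all_upsells(timestamp) defined above; the to_char format, the explicit LATERAL keyword, and counting payor_email are choices made here for illustration.

```sql
SELECT to_char(months.m, 'MM/YYYY') AS "Month",
       count(up.payor_email)        AS "Total Upselled Accounts"
FROM generate_series('2014-08-01'::timestamp,
                     date_trunc('month', LOCALTIMESTAMP),
                     '1 month') AS months(m)
-- The function already restricts its rows to the month of months.m,
-- so no further join condition is needed.
LEFT JOIN LATERAL list_all_upsells(months.m) AS up ON true
GROUP BY months.m
ORDER BY months.m;  -- sort on the timestamp, not on the formatted text
```

Sorting on months.m rather than on the formatted string keeps the rows in chronological order; sorting on the text would order them by month number first.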

SQLFiddle
