How to compute a running total using append-only rows



I have a table whose rows are never mutated, only inserted; they are immutable records. It has the following fields:

  • id (int)
  • user_id (int)
  • created (datetime)
  • is_cool (boolean)
  • likes_fruits (boolean)

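An object is tied to a user, and the "current" object for a given user is the one with the latest created date. For example, if I want to update is_cool for a user, I append a record with a new created timestamp and is_cool=true.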

I want to compute how many users are is_cool at the end of each day. That is, I would like the output table to have these columns:

  • day: some kind of date_trunc('day', created)
  • cool_users_count: the number of users who have is_cool set at the end of that day.

What SQL query can I write? FWIW I am using Presto (if it matters).

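Note that there are other columns, such as likes_fruits, which means that a record with is_cool false does not imply that is_cool just changed to false - it may have been false for a while.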

Here is procedural pseudocode representing what I want to do in SQL:

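# rows = ...  (each row has user_id, created, is_cool)
from datetime import timedelta

min_date = min(row.created for row in rows).date()
max_date = max(row.created for row in rows).date()
counts_by_day = {}
date = min_date
while date <= max_date:
    # Latest row per user among the rows created on or before this day
    latest_row_by_user = {}
    for row in sorted(rows, key=lambda r: r.created):
        if row.created.date() <= date:
            latest_row_by_user[row.user_id] = row
    # Count the users whose latest row has is_cool set
    counts_by_day[date] = sum(1 for row in latest_row_by_user.values() if row.is_cool)
    date += timedelta(days=1)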
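To make the intended semantics concrete, here is a small self-contained Python sketch of the same idea run on made-up sample rows. The `Row` tuple, the function name `cool_users_by_day`, and the sample data are all illustrative assumptions, not part of the real table. It walks the rows once in `created` order and carries each user's latest `is_cool` forward, so days without any rows still get a count.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical stand-in for a table row; only the fields needed here.
Row = namedtuple("Row", "user_id created is_cool likes_fruits")


def cool_users_by_day(rows):
    """Return {date: number of users whose latest row up to that day has is_cool}."""
    ordered = sorted(rows, key=lambda r: r.created)
    if not ordered:
        return {}
    counts = {}
    latest = {}  # user_id -> is_cool taken from the latest row seen so far
    i = 0
    day = ordered[0].created.date()
    last_day = ordered[-1].created.date()
    while day <= last_day:
        # Consume every row created on or before the current day.
        while i < len(ordered) and ordered[i].created.date() <= day:
            latest[ordered[i].user_id] = ordered[i].is_cool
            i += 1
        counts[day] = sum(1 for cool in latest.values() if cool)
        day += timedelta(days=1)
    return counts


# Made-up sample data: user 2 flips to cool on Jan 1, then only changes
# likes_fruits on Jan 2; user 1 stops being cool on Jan 4; Jan 3 has no rows.
SAMPLE_ROWS = [
    Row(1, datetime(2020, 1, 1, 10), True, False),
    Row(2, datetime(2020, 1, 1, 11), False, False),
    Row(2, datetime(2020, 1, 1, 18), True, False),
    Row(2, datetime(2020, 1, 2, 12), True, True),
    Row(1, datetime(2020, 1, 4, 9), False, False),
]

for day, count in cool_users_by_day(SAMPLE_ROWS).items():
    print(day, count)
```

With this sample the count stays at 2 through Jan 3 (a day with no inserts) and drops to 1 on Jan 4, which is the carry-forward behaviour a SQL solution has to reproduce.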

You can do this with just a query. Try summing over the boolean with a group by:
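  -- Presto cannot sum a boolean directly, so map it to 1/0 first.
  -- Note: this counts is_cool rows inserted per day, not users.
  select date(created), sum(case when is_cool then 1 else 0 end)
  from my_table
  group by date(created)

Or, if you need the number of users: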

select t.date_created, count(*) num_user
from  (
  select  distinct date(created) date_created, user_id 
  from  my_table  
  where is_cool = TRUE 
 ) t 
 group by  t.date_created

Or, if you need the last value of is_cool:

-- Counts users whose last row of the day has is_cool set.
-- Only users who have a row on that day are considered.
select date(max_date), count(*)
from (
    select t.user_id, t.max_date, m.is_cool
    from my_table m
    inner join (
        -- latest timestamp per user per day
        select max(created) max_date, user_id
        from my_table
        group by user_id, date(created)
    ) t on t.max_date = m.created
       and t.user_id = m.user_id
    where m.is_cool = TRUE
) t2
group by date(max_date)

A correlated subquery is probably the simplest solution. The following gets the value of is_cool for each user on each date:

select u.user_id, d.date,
       -- latest is_cool for this user as of the end of d.date
       -- (date_add is the Presto spelling of dateadd)
       (select t.is_cool
        from t
        where t.user_id = u.user_id and
              t.created < date_add('day', 1, d.date)
        order by t.created desc
        limit 1
       ) as is_cool
from (select distinct date(created) as date
      from t
     ) d cross join
     (select distinct user_id
      from t
     ) u ;

Then aggregate:

-- Map the boolean to 1/0 before summing; NULL (no row yet) counts as 0.
select date, sum(case when is_cool then 1 else 0 end)
from (select u.user_id, d.date,
             (select t.is_cool
              from t
              where t.user_id = u.user_id and
                    t.created < date_add('day', 1, d.date)
              order by t.created desc
              limit 1
             ) as is_cool
      from (select distinct date(created) as date
            from t
           ) d cross join
           (select distinct user_id
            from t
           ) u
     ) ud
group by date;
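An alternative worth sketching, since the question is framed as a running total, is to turn each user's daily state into a +1/0/-1 change and take a cumulative sum with window functions. The sketch below is an assumption-laden illustration rather than a tested Presto query: it runs against an in-memory SQLite database through Python's `sqlite3` (window functions need SQLite 3.25 or newer), stores `is_cool` as 0/1 integers, and uses a made-up table `my_table`, sample rows, and helper name `cool_users_by_day_sql`. Porting it to Presto would at least require mapping the boolean with a CASE expression.

```python
import sqlite3

# Running total via per-user deltas, written in SQLite's dialect.
RUNNING_TOTAL_SQL = """
WITH last_per_day AS (
    -- rank each user's rows within a day, newest first
    SELECT user_id, date(created) AS day, is_cool,
           row_number() OVER (PARTITION BY user_id, date(created)
                              ORDER BY created DESC) AS rn
    FROM my_table
),
deltas AS (
    -- +1 when a user becomes cool, -1 when they stop, 0 otherwise
    SELECT day,
           is_cool - coalesce(lag(is_cool) OVER (PARTITION BY user_id
                                                 ORDER BY day), 0) AS delta
    FROM last_per_day
    WHERE rn = 1
),
daily AS (
    SELECT day, sum(delta) AS net_change
    FROM deltas
    GROUP BY day
)
SELECT day, sum(net_change) OVER (ORDER BY day) AS cool_users_count
FROM daily
ORDER BY day
"""


def cool_users_by_day_sql(rows):
    """Load rows into an in-memory table and return [(day, cool_users_count)]."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER, user_id INTEGER, created TEXT, "
        "is_cool INTEGER, likes_fruits INTEGER)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(RUNNING_TOTAL_SQL).fetchall()
    conn.close()
    return result


# Made-up sample rows: (id, user_id, created, is_cool, likes_fruits)
SAMPLE_ROWS = [
    (1, 1, "2020-01-01 10:00:00", 1, 0),
    (2, 2, "2020-01-01 11:00:00", 0, 0),
    (3, 2, "2020-01-01 18:00:00", 1, 0),  # last value of the day wins
    (4, 3, "2020-01-02 08:00:00", 1, 0),
    (5, 2, "2020-01-02 12:00:00", 1, 1),  # only likes_fruits changed
    (6, 1, "2020-01-03 09:00:00", 0, 0),
]

print(cool_users_by_day_sql(SAMPLE_ROWS))
```

Like the correlated-subquery version, this only emits days on which at least one row was inserted. It avoids the users-by-dates cross join, which may matter on a large table, though that trade-off is not measured here.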
