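I have a business table, shown below, that records when people purchase goods online. I want to see 7-day retention: for each day, how many of the customers who purchased on that day (day 1) came back to shop again by day 7.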
customer_ID |purchase_date
1 |2017-01-01
2 |2017-01-01
3 |2017-01-01
2 |2017-01-06
2 |2017-01-07
Here is my Presto code:
SELECT
    COUNT(DISTINCT bp1.customer_ID) AS retained_customer,
    bp1.purchase_date
FROM
    business bp1,
    business bp2
WHERE
    bp1.customer_ID = bp2.customer_ID
    AND CAST(bp2.purchase_date AS date) BETWEEN date_add('day', 1, CAST(bp1.purchase_date AS date))
    AND date_add('day', 6, CAST(bp1.purchase_date AS date))
GROUP BY
    2
ORDER BY
    2
It runs forever. Does anyone have a more efficient way to solve this problem?
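To make the intended result concrete, here is a minimal Python sketch of what the WHERE clause above appears to ask for, run on the five sample rows. The helper name `retained_per_day` and its parameters are hypothetical, chosen only for illustration. It groups purchase dates per customer first, so each row is checked against that customer's own dates rather than against every other row in the table.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question: (customer_ID, purchase_date).
purchases = [
    (1, date(2017, 1, 1)),
    (2, date(2017, 1, 1)),
    (3, date(2017, 1, 1)),
    (2, date(2017, 1, 6)),
    (2, date(2017, 1, 7)),
]

def retained_per_day(rows, first_offset=1, last_offset=6):
    """Per purchase date, count distinct customers who buy again 1..6 days later."""
    # Collect every purchase date of each customer once.
    dates_by_customer = defaultdict(set)
    for customer, day in rows:
        dates_by_customer[customer].add(day)

    retained = defaultdict(set)
    for customer, day in rows:
        # Days day+1 .. day+6, mirroring the BETWEEN bounds in the query.
        window = (day + timedelta(days=n)
                  for n in range(first_offset, last_offset + 1))
        if any(d in dates_by_customer[customer] for d in window):
            retained[day].add(customer)
    return {day: len(customers) for day, customers in sorted(retained.items())}

result = retained_per_day(purchases)
for day, count in result.items():
    print(day, count)
```

As with the inner join in the query, dates on which nobody returned do not appear in the output at all.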
I'm not sure what Presto has to do with the query, but here is a query that gives the information you describe:
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE IF NOT EXISTS `business` (
    `id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Primary Key',
    `customer_id` INT(11) UNSIGNED NULL DEFAULT 0 COMMENT 'Use for a Foreign Key or integer value',
    `purchase_date` TIMESTAMP NOT NULL DEFAULT '2017-07-07' COMMENT 'Purchase timestamp',
    PRIMARY KEY (`id`)
)
ENGINE=MyISAM
AUTO_INCREMENT=1
DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci
COMMENT '';
INSERT INTO `business`
(`customer_id`,`purchase_date`)
VALUES
(1,'2017-01-01'),
(2,'2017-01-01'),
(3,'2017-01-04'),
(2,'2017-01-06'),
(2,'2017-01-07'),
(3,'2017-01-05'),
(3,'2017-01-06');
Query 1:
SELECT
    Count(DISTINCT b.customer_id) as `NumRetained`,
    CAST(a.purchase_date as DATE) as `Purchase_Date`,
    MIN(b.purchase_date) as `first_purchase`,
    MAX(b.purchase_date) as `last_purchase`
FROM (SELECT
        d.customer_id, MIN(d.purchase_date) as `purchase_date`
    FROM business d
    GROUP BY d.customer_id
    ) a
LEFT JOIN business b
    ON a.customer_id = b.customer_id
    AND CAST(b.purchase_date as DATE)
        BETWEEN DATE_ADD(CAST(a.purchase_date AS DATE),INTERVAL 1 DAY) AND
        DATE_ADD(CAST(a.purchase_date AS DATE),INTERVAL 6 DAY)
GROUP BY a.purchase_date
ORDER BY a.purchase_date
Results:
| NumRetained | Purchase_Date | first_purchase | last_purchase |
|-------------|---------------|----------------------|----------------------|
| 1 | 2017-01-01 | 2017-01-06T00:00:00Z | 2017-01-07T00:00:00Z |
| 1 | 2017-01-04 | 2017-01-05T00:00:00Z | 2017-01-06T00:00:00Z |
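The result can be checked locally with a small Python sketch that runs the same join logic against an in-memory SQLite database. This is an approximation under stated assumptions: SQLite stands in for MySQL, its `date(x, '+N days')` function replaces `DATE_ADD`, dates are stored as ISO text, and the column name `first_date` is one I chose for the subquery's earliest purchase per customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business (customer_id INTEGER, purchase_date TEXT)")
# Same seven rows as the INSERT statement in the schema setup.
conn.executemany(
    "INSERT INTO business VALUES (?, ?)",
    [
        (1, "2017-01-01"), (2, "2017-01-01"), (3, "2017-01-04"),
        (2, "2017-01-06"), (2, "2017-01-07"), (3, "2017-01-05"),
        (3, "2017-01-06"),
    ],
)

query = """
SELECT
    COUNT(DISTINCT b.customer_id) AS num_retained,
    a.first_date,
    MIN(b.purchase_date) AS first_purchase,
    MAX(b.purchase_date) AS last_purchase
FROM (
    -- Earliest purchase per customer defines that customer's day 1.
    SELECT customer_id, MIN(purchase_date) AS first_date
    FROM business
    GROUP BY customer_id
) a
LEFT JOIN business b
    ON a.customer_id = b.customer_id
    -- ISO-formatted text dates compare correctly as strings in SQLite.
    AND b.purchase_date BETWEEN date(a.first_date, '+1 day')
                            AND date(a.first_date, '+6 days')
GROUP BY a.first_date
ORDER BY a.first_date
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Customer 1 never returns, so the LEFT JOIN leaves `b.customer_id` NULL for that customer and `COUNT(DISTINCT ...)` skips it, which is why the 2017-01-01 row reports 1 rather than 2.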