SQLite对自引用表中的子表求和



我有一个ecosystems表,它是自引用的,因为每个生态系统都可以有子生态系统。每个生态系统也可以有一个分数(代表生态系统的健康程度)。列是(带有示例数据):

| slug | parent_slug | full_slug   | score |
| ---- | ----------- | ----------- | ----- |
| aaa  | null        | aaa         | 1     |
| bbb  | aaa         | aaa/bbb     | 2     |
| ccc  | bbb         | aaa/bbb/ccc | 4     |
| ddd  | null        | ddd         | 8     |
| eee  | ddd         | ddd/eee     | 16    |
| fff  | null        | fff         | 32    |

full_slug列表示从顶层生态系统向下的完整路径。这是多余的,因为它可以从slugparent_slug列中推断出来,但它确实存在。

我想要实现的是创建一个具有相同行数的查询,但有一个列total_score,它计算每个生态系统的分数加上它的所有子生态系统的分数,递归地。也就是说,输出应该是:

| slug | total_score |
| ---- | ----------- |
| aaa  | 7           |
| bbb  | 6           |
| ccc  | 4           |
| ddd  | 24          |
| eee  | 16          |
| fff  | 32          |

我开始了下面的查询:

WITH top AS (
SELECT
SUM(e.score) as total_score,
CASE instr(e.full_slug, '/') WHEN 0 THEN
e.full_slug
ELSE
substr(e.full_slug, 0, instr(e.full_slug, '/'))
END AS top_level_eco
FROM ecosystems e
GROUP BY top_level_eco
)
SELECT
e.slug,
top.total_score
FROM ecosystems e
INNER JOIN top on top.top_level_eco = e.slug;

但不幸的是,它只显示了顶级生态系统及其总分。

我可以想到几个答案…

一般的答案是使用递归…

WITH
tree AS
(
SELECT
slug   AS base_slug,
slug   AS current_slug,
score  AS score
FROM
ecosystems
UNION ALL
SELECT
t.base_slug,
e.slug,
e.score
FROM
tree        t
INNER JOIN
ecosystems  e
ON e.parent_slug = t.current_slug
)
SELECT
base_slug    AS slug,
SUM(score)   AS total_score
FROM
tree
GROUP BY
base_slug
ORDER BY
base_slug

另一个选择是在JOIN中使用full_slug,尽管这将禁止使用索引,并且通常比上面的一般解决方案要慢得多。

SELECT
e.slug,
SUM(m.score)   AS total_score
FROM
ecosystems    e
INNER JOIN
ecosystems    m  -- members
ON '/' || m.full_slug || '/' LIKE '%/' || e.slug || '/%'
GROUP BY
e.slug
ORDER BY
e.slug

第三种方法是unnest/explodefull_slug(即为full_slug的每个组件创建一行),然后按组件分组。SQLite本身没有这个功能,所以这个问题也可能通过递归来解决。

WITH
tree AS
(
SELECT
SUBSTR(full_slug || '/', 1, INSTR(full_slug || '/', '/')-1)   AS slug,
SUBSTR(full_slug || '/',    INSTR(full_slug || '/', '/')+1)   AS path,
score
FROM
ecosystems
UNION ALL
SELECT
SUBSTR(path, 1, INSTR(path, '/')-1)   AS slug,
SUBSTR(path,    INSTR(path, '/')+1)   AS path,
score
FROM
tree
WHERE
tree.path <> ''
)
SELECT
slug,
SUM(score)    AS total_score
FROM
tree
GROUP BY
slug
ORDER BY
slug

三种方法的演示:

  • https://dbfiddle.uk/?rdbms=sqlite_3.27&小提琴= 4 e535c238823843c3c27cb8ec617c974

最新更新