如何在 BigQuery 中显示多个分层树结构



我正在研究主管及其受监督员工的树状层次结构。困难在于,有些主管是受其他主管监督的员工,而且很多。

对于我从类中获得的SQL查询,仅涉及简单的自连接,这可能只有两个级别:A由B监督,仅此而已。

但来自现实世界的问题要复杂得多。有多个级别,我不确定确切的数字。例如,A 由 B 监督,B 由 C 监督,C 由 D 监督,等等。

我假设只有 5 个或更多级别的监督。原始数据可能如下所示:

Employee     Supervisor
A             B
C             B
D             B
B             V
E             V
F             E
G             V
V          (Blank which indicates no boss)
H             A

一些 BigQuery 专家提供的代码如下:

#standardSQL
SELECT t.Supervisor,
IF(t.Supervisor = t5.Supervisor, 
STRUCT(Employee2 AS Employee1, NULL AS Employee2),
STRUCT(t5.Supervisor AS Employee1, Employee2 AS Employee2)
).*
FROM (
SELECT t1.Employee Supervisor,
COALESCE(t4.Employee, t3.Employee, t2.Employee) Employee2
FROM `project.dataset.table` t1
LEFT JOIN `project.dataset.table` t2 ON t2.Supervisor = t1.Employee
LEFT JOIN `project.dataset.table` t3 ON t3.Supervisor = t2.Employee
LEFT JOIN `project.dataset.table` t4 ON t4.Supervisor = t3.Employee
WHERE t1.Supervisor IS NULL
) t
LEFT JOIN `project.dataset.table` t5 ON t5.Employee = t.Employee2

结果变成了这样:

Row Supervisor  Employee1   Employee2    
1   V           B           A    
2   V           B           C    
3   V           B           D    
4   V           E           F    
5   V           G           null  

但我们想要的是:

Row Supervisor  Employee1   Employee2  Employee3  
1   V           B           A          H
2   V           B           C         Null
3   V           B           D         Null
4   V           E           F         Null
5   V           G           null      Null

那么,如果我想拥有更多级别的层次结构,如何更改代码? 这意味着如果我想添加 employee3 或 4,我该如何编辑它?谢谢!

下面是 BigQuery Standard SQL

#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1 
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
)
SELECT * FROM e3   

如果应用于问题中的样本数据 - 结果/输出为

Row Supervisor  Employee1   Employee2   Employee3    
1   V           B           A           H    
2   V           B           C           null     
3   V           B           D           null     
4   V           E           F           null     
5   V           G           null        null       

您可以轻松地扩展上面添加更多级别,如下所示(替换并替换为相应的数字,如 4、5、6、7 等(显然达到合理的扩展

e<N> AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, ... , Employee AS Employee<N>
FROM e<N-1> e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee<N-1>
)   
SELECT * FROM e<N>     

例如

#standardSQL
WITH e0 AS (
SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
SELECT e.Supervisor, Employee AS Employee1 
FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
SELECT e.Supervisor, Employee1, Employee AS Employee2
FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
), e4 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee AS Employee4
FROM e3 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee3
), e5 AS (
SELECT e.Supervisor, Employee1, Employee2, Employee3, Employee4, Employee AS Employee5
FROM e4 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee4
)
SELECT * FROM e5

最新更新