为什么我的子查询适用于字符串,而不是字段引用



我有一个(我认为是(相当复杂的查询。查询获取我想要的记录,然后获取第一个响应中引用的所有数据。如果我的子查询条件是字符串,它有效,但如果它是一个字段(具有完全相同的值(,则无效。

// Query with string as conditional in lowest sub-query (4th line from the bottom)
SELECT 
     e1.entity as entity
    ,ARRAY_CAT(
      ARRAY_COMPACT( 
        ARRAY_CONSTRUCT( 
          any_value(e2.entity), 
          any_value(u1.user) 
        )
      )
      ,ARRAY_AGG(e3.entity)
    ) as includes
FROM ENTITIES e1
LEFT JOIN ENTITIES e2 ON e1.entity:owner:workspace = e2.entity:id
LEFT JOIN USERS u1 ON e1.entity:owner:user = u1.user:id
LEFT JOIN ENTITIES e3 ON e3.entity:id IN (
  SELECT ee2.value FROM 
  table(FLATTEN( input=> 
    SELECT SPLIT(LISTAGG( CASE WHEN IS_ARRAY(ee1.value:id) THEN ARRAY_TO_STRING(ee1.value:id, ',') ELSE ee1.value:id END, ','), ',') 
    FROM table(FLATTEN( input => ( SELECT e4.entity:relationships:entities FROM ENTITIES e4 WHERE e4.entity:id = 'bd265f29-ca32-449a-b765-bb488e4d6b3c' ) )) ee1
  )) ee2
)
GROUP BY e1.entity

以上产生:

"实体"列:https://jsonblob.com/6d98b587-8989-11e9-b738-a9487a0dac0b

"包括"列:https://jsonblob.com/068a8672-8988-11e9-b738-77f0e471310b

但是,如果我将uuid字符串(bd265f29-ca32-449a-b765-bb488e4d6b3c(更改为e1.entity:id(如下(,则会出现错误SQL compilation error: Unsupported subquery type cannot be evaluated

SELECT 
     e1.entity as entity
    ,ARRAY_CAT(
      ARRAY_COMPACT( 
        ARRAY_CONSTRUCT( 
          any_value(e2.entity), 
          any_value(u1.user) 
        )
      )
      ,ARRAY_AGG(e3.entity)
    ) as includes
FROM ENTITIES e1
LEFT JOIN ENTITIES e2 ON e1.entity:owner:workspace = e2.entity:id
LEFT JOIN USERS u1 ON e1.entity:owner:user = u1.user:id
LEFT JOIN ENTITIES e3 ON e3.entity:id IN (
  SELECT ee2.value FROM 
  table(FLATTEN( input=> 
    SELECT SPLIT(LISTAGG( CASE WHEN IS_ARRAY(ee1.value:id) THEN ARRAY_TO_STRING(ee1.value:id, ',') ELSE ee1.value:id END, ','), ',') 
    FROM table(FLATTEN( input => ( SELECT e4.entity:relationships:entities FROM ENTITIES e4 WHERE e4.entity:id = e1.entity:id ) )) ee1
  )) ee2
)
GROUP BY e1.entity

我不知道为什么开关导致错误。为什么我的子查询使用字符串,而不是字段引用?

因此,使用几个CTE来提供数据,何时可以完成相关子查询的大部分提升。我将两种形式的事物数组放在实体中,以及具有多个 id 的单个实体,如您的 FLATTEN 用法所示:

WITH users AS (
    SELECT parse_json('{"id":1}') as user
), entities AS (
    SELECT parse_json(column1) as entity 
    FROM VALUES
        ('{"id":10, "relationships":{"entities":[{"id":11},{"id":12}]}, "owner":{"user":1,"workspace":10}}'),
        ('{"id":11, "relationships":{"entities":[{"id":11}]}}'),
        ('{"id":12, "relationships":{"entities":[{"id":[10,11]}]}}')
), ent1 AS (
    SELECT e4.entity:id as ent_id
        ,ee1.index
        ,SPLIT(LISTAGG( IFF( IS_ARRAY(ee1.value:id), ARRAY_TO_STRING(ee1.value:id, ','), ee1.value:id), ','), ',') as vals
    FROM ENTITIES AS e4, 
    TABLE(FLATTEN( input => e4.entity:relationships:entities )) ee1
    GROUP BY 1,2
), ent_rels AS (
    SELECT ent_id, ee2.value::number as rel_id
    FROM ent1 ee1,
    TABLE(FLATTEN( input => ee1.vals)) ee2
)
SELECT 
    e1.entity:id as entity
    ,e2.entity:id as e2_entity
    ,u1.user:id as u1_user
    ,e3.entity:id as e3_entity
FROM ENTITIES e1
LEFT JOIN ENTITIES e2 ON e1.entity:owner:workspace = e2.entity:id
LEFT JOIN USERS u1 ON e1.entity:owner:user = u1.user:id
LEFT JOIN ent_rels er ON er.ent_id = e1.entity:id
LEFT JOIN ENTITIES e3 ON e3.entity:id = er.rel_id
ORDER BY e1.entity:id;

因此,此SQL不是您选择的结果,但确实显示内容按预期加入。

ENTITY    E2_ENTITY    U1_USER    E3_ENTITY
10        10           1          11
10        10           1          12
11        null         null       11
12        null         null       10
12        null         null       11

所以这个最终的选择是你最初的方式

SELECT 
     e1.entity as entity
    ,ARRAY_CAT(
      ARRAY_COMPACT( 
        ARRAY_CONSTRUCT( 
          any_value(e2.entity), 
          any_value(u1.user) 
        )
      )
      ,ARRAY_AGG(e3.entity)
    ) as includes
FROM ENTITIES e1
LEFT JOIN ENTITIES e2 ON e1.entity:owner:workspace = e2.entity:id
LEFT JOIN USERS u1 ON e1.entity:owner:user = u1.user:id
LEFT JOIN ent_rels er ON er.ent_id = e1.entity:id
LEFT JOIN ENTITIES e3 ON e3.entity:id = er.rel_id
GROUP BY e1.entity
ORDER BY e1.entity:id;

此外,鉴于您正在撤消两层嵌套以获取匹配的id,您可以避免LISTAGG和SPLITS,并通过以下方式分解它们:

), ent1 AS (
    SELECT e4.entity:id as ent_id
        ,ee1.value:id as vals
    FROM ENTITIES AS e4, 
    TABLE(FLATTEN( input => e4.entity:relationships:entities )) ee1
), ent_rels AS (
    SELECT ent_id
        ,coalesce(ee2.value,ee1.vals) as rel_id
    FROM ent1 ee1,
    TABLE(FLATTEN( input => ee1.vals, outer => true)) ee2
)

如果您的偏好,可以合并/嵌套:

, ent_rels AS (
    SELECT ent_id
        ,coalesce(ee3.value,ee2.vals) as rel_id
    FROM (
      SELECT e1.entity:id as ent_id
        ,ee1.value:id as vals
        FROM ENTITIES AS e1, 
        TABLE(FLATTEN( input => e1.entity:relationships:entities )) ee1
    ) ee2,
    TABLE(FLATTEN( input => ee2.vals, outer => true)) ee3
)

关于子查询的 Snowflake 文档包含以下限制:

相关标量子查询目前仅当可以静态确定为返回一行时才受支持(例如,如果 SELECT 列表包含没有 GROUP BY 的聚合函数(。

因此,您可以尝试:

( SELECT MAX(e4.entity:relationships:entities)
  FROM ENTITIES e4
  WHERE e4.entity:id = e1.entity:id
)

你试过这样投吗?

e1.entity:id::string

Snowflake 文档提到:

FLATTEN 内部具有相关性的子查询当前是 支持。

您不能简单地使用 e1.entity:relations:entities 而不是子查询吗?

最新更新