Select the row with the most recent timestamp and get values from a column with a Variant data type



I hope the title makes sense; I'm open to suggestions if it should be made more readable.

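I have a temporary table in Snowflake called BI_Table_Temp. It has two columns: Load_DateTime, with data type TIMESTAMP_LTZ(9), and JSON_DATA, a Variant column holding nested records from a JSON file. I want to query this table and then ingest the result into another table, but I want to make sure I always get the row with the most recent Load_DateTime.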

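I have tried the query below. It works, but it returns the Load_DateTime column, which I don't want. I only want the values from the JSON_DATA row that has the maximum Load_DateTime timestamp: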

SELECT
    MAX(Load_DateTime),
    transactions.value:id::string AS id,
    transactions.value:value2::string AS account_value,
    transactions.value:value3::string AS new_account_value
FROM BI_Table_Temp,
LATERAL FLATTEN (JSON_DATA:transactions) AS transactions
GROUP BY transactions.value

One simple option:


WITH data AS (
  -- Flatten the nested transactions, keeping each row's load timestamp
  SELECT Load_DateTime
    , transactions.value:id::string AS id
    , transactions.value:value2::string AS account_value
    , transactions.value:value3::string AS new_account_value
  FROM BI_Table_Temp,
  LATERAL FLATTEN (JSON_DATA:transactions) AS transactions
), max_load AS (
  -- Latest load timestamp for each id
  SELECT MAX(Load_DateTime) AS Load_DateTime, id
  FROM data
  GROUP BY id
)
-- Keep only the rows from each id's latest load; Load_DateTime is not selected
SELECT id
    , account_value
    , new_account_value
FROM data
JOIN max_load
USING (id, Load_DateTime)

Since transactions.value is a variant, I'm guessing that where you wrote GROUP BY transactions.value, what you really meant was GROUP BY transactions.value:id.
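
If what you actually need is only the single most recently loaded row of the table (rather than the latest load per id), a shorter variant is to pick that row first and flatten it afterwards. This is only a sketch under that assumption: the aliases latest and t are names I made up for illustration, and if two rows share the same maximum Load_DateTime, ROW_NUMBER picks one of them arbitrarily.

```sql
SELECT
    t.value:id::string     AS id,
    t.value:value2::string AS account_value,
    t.value:value3::string AS new_account_value
FROM (
    -- Keep only the row with the most recent Load_DateTime
    SELECT JSON_DATA
    FROM BI_Table_Temp
    QUALIFY ROW_NUMBER() OVER (ORDER BY Load_DateTime DESC) = 1
) AS latest,
-- Expand the nested transactions array of that one row
LATERAL FLATTEN(input => latest.JSON_DATA:transactions) AS t;
```

Because Load_DateTime is used only inside the subquery's QUALIFY clause, it never appears in the output.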
