我有一些json数据,其中包括一个属性"字符",它看起来像这样:
select json_data['characters'] from latest_snapshot_events
回报:[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]
这将在单行上返回。我希望数组中的每个项目都有一行。
我发现了几个SO帖子和其他博客,建议我使用unnest()
。我已经尝试了几次,但无法返回结果。例如,这是来自 presto 的文档。底部盖子作为蜂巢侧视图爆炸的替身展开:
SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);
所以我试图将其应用于我的表格:
characters as (
select
jdata.characters
from latest_snapshot_events
cross join unnest(json_data) as t(jdata)
)
select * from characters;
其中json_data
是latest_snapshot_events中包含属性"字符"的字段,该属性是一个类似于上面所示的数组。
这将返回一个错误:
[Simba]AthenaJDBC AWS Athena 客户端引发错误。SYNTAX_ERROR:第 69:12 行:列别名列表有 1 个条目,但"t"有 2 列可用
如何将latest_snapshot_events.json_data['characters']
取消嵌套/分解到多行上?
由于characters
是文本表示中的JSON数组,因此您必须:
- 使用
json_parse
分析 JSON 文本以生成 JSON 类型的值。 - 使用
CAST
将 JSON 值转换为 SQL 数组。 - 使用
UNNEST
分解数组。
例如:
WITH data(characters) AS (
VALUES '[{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":10,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":3},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":39,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":2},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":6801450488388220,"shards":0,"CHAR_TPIECES":0,"CHAR_A5_LVL":1,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4},{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,"CHAR_TIER":1,"ITEM":8355588830097610,"shards":0,"CHAR_TPIECES":5,"CHAR_A5_LVL":0,"CHAR_A2_LVL":1,"CHAR_A4_LVL":1,"ITEM_CATEGORY":"Character","ITEM_LEVEL":4}]'
)
SELECT entry
FROM data, UNNEST(CAST(json_parse(characters) AS array(json))) t(entry)
它产生:
entry
-----------------------------------------------------------------------
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":60,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":50,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":80,"CHAR_A3_LVL":1,...
{"CHAR_STARS":1,"CHAR_A1_LVL":1,"ITEM_POWER":85,"CHAR_A3_LVL":1,...
在上面的示例中,我将 JSON 值转换为array(json)
,但是 您可以进一步将其转换为更具体的内容,如果每个中的值 数组条目具有常规架构。例如,对于您的数据,它是 可以将其转换为array(map(varchar, json))
,因为 数组是一个 JSON 对象。
如果您的初始数据是 JSON 字符串,则json_parse
工作。但是,对于array(row)
类型(即对象/字典数组),转换为array(json)
会将每一行转换为数组,从对象中删除所有键并阻止您使用点表示法或json_extract
函数。
要取消嵌套array(row)
数据,语法要简单得多:
CROSS JOIN UNNEST(my_array) AS my_row
我在尝试取消透视数据时遇到了此错误。
这可能会对某人有所帮助:
SELECT a_col, b_col
FROM
(
SELECT MAP(
ARRAY['a', 'b', 'c', 'd'],
ARRAY[1, 2, 3, 4]
) my_col
) CROSS JOIN UNNEST(my_col) as t(a_col, b_col)
t() 允许您定义多个列作为输出。