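I have a source table containing data in VARCHAR format, as in the example below. I want to insert the data into another table in JSON format (the result column itself can be of type JSON or VARCHAR).
- For each Id there is at least one JSONName/JSONValue pair.
- However, the Ids do not all have the same kind or number of JSONName/JSONValue pairs.
- Each Id can have at most 50 JSONName/JSONValue pairs.
- The order of the pairs in the resulting JSON column value does not matter.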
SourceTable:
____________________________
| Id | JSONName | JSONValue |
|____|__________|___________|
| 1  | Name     | John      |
| 2  | Name     | Henry     |
| 2  | Age      | 32        |
| 3  | Age      | 56        |
| 3  | Location | US        |
| 4  | Age      | 24        |
| 4  | Name     | Andrew    |
| 4  | Location |           |
What I want:
Expected ResultTable:
____________________________________________________
| Id | ResultJSON                                   |
|____|______________________________________________|
| 1  | {"Name":"John"}                              |
| 2  | {"Name":"Henry","Age":"32"}                  |
| 3  | {"Age":"56","Location":"US"}                 |
| 4  | {"Age":"24","Name":"Andrew","Location":null} |
What I get with my current query:
Wrong ResultTable:
_______________________________________________________________________________________________________________________________
| Id | ResultJSON |
|____|_________________________________________________________________________________________________________________________|
| 1 | [{"JSONName":"Name","JSONValue":"John"}] |
| 2 | [{"JSONName":"Name","JSONValue":"Henry"},{"JSONName":"Age","JSONValue":"32"}] |
| 3 | [{"JSONName":"Age","JSONValue":"56"},{"JSONName":"Location","JSONValue":"US"}] |
| 4 | [{"JSONName":"Age","JSONValue":"24"},{"JSONName":"Name","JSONValue":"Andrew"},{"JSONName":"Location","JSONValue":null}] |
Current query:
INSERT INTO ResultTable
(
    Id
   ,ResultJSON
)
SELECT
    SourceTable.Id
   ,JSON_AGG(SourceTable.JSONName,SourceTable.JSONValue)
FROM SourceTable
INNER JOIN OtherTable ON SourceTable.Id=OtherTable.Id
GROUP BY SourceTable.Id
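Is it possible to do this with the Teradata JSON functions? If not, what would be the most efficient query?
You can use a regular expression to remove the unwanted parts: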
SELECT
    SourceTable.Id
   ,RegExp_Replace(Cast(Json_Agg(SourceTable.JSONName AS "#A",SourceTable.JSONValue AS "#B") AS VARCHAR(32000)), '"#A":|,"#B"|^\[|\]$|}(?=,{")|(?<="},){')
FROM SourceTable
GROUP BY 1
The regular expression removes all of the following:
- "#A":
- ,"#B"
- a leading [
- a trailing ]
- } if it is followed by ,{"
- { if it is preceded by "},
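To see the effect on a single value, the pattern can be tried against a literal string. This is only a sketch: the literal below is what Json_Agg would be expected to return for Id 2 with the "#A"/"#B" aliases, and the square brackets are written escaped (\[ and \]) on the assumption that they are meant to match literally.

```sql
-- Hypothetical one-row check; the string literal stands in for the
-- Json_Agg output of Id 2 after the cast to VARCHAR.
SELECT RegExp_Replace(
         '[{"#A":"Name","#B":"Henry"},{"#A":"Age","#B":"32"}]'
        ,'"#A":|,"#B"|^\[|\]$|}(?=,{")|(?<="},){'
       ) AS ResultJSON;
-- Intended result, following the removal rules listed above:
-- {"Name":"Henry","Age":"32"}
```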
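Edit:
As pointed out in the comments, this regular expression leaves a superfluous opening brace behind. This seems to work better:
'"#A":|,"#B"|^\[|\]$|}(?=,)|(?<=,){'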
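For completeness, here is a sketch of the revised pattern dropped into the query above and wrapped in the INSERT from the question. The join to OtherTable from the question is left out for brevity, and the brackets are again written escaped on the assumption that they should match literally. The looser lookarounds matter when a null value is not the last pair: `{"Location":null},{"Age":"24"}` has no quote before `},`, so a lookbehind that requires one never removes the following `{`.

```sql
-- Sketch only: ResultTable, SourceTable and the column names come from the question.
INSERT INTO ResultTable
(
    Id
   ,ResultJSON
)
SELECT
    SourceTable.Id
   ,RegExp_Replace(
        Cast(Json_Agg(SourceTable.JSONName AS "#A", SourceTable.JSONValue AS "#B") AS VARCHAR(32000))
        -- } before a comma and { after a comma are removed, whatever surrounds them
       ,'"#A":|,"#B"|^\[|\]$|}(?=,)|(?<=,){'
    )
FROM SourceTable
GROUP BY 1;
```

Note that this pattern also strips a } directly before a comma, or a { directly after one, when they occur inside a JSONValue, so it is only safe if the values cannot contain those sequences.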
This is the query I finally ended up with:
INSERT INTO DB.RESULT_TABLE
(
    ResultId
   ,ResultJSON
)
WITH RECURSIVE MergedTable (Id, mergedList, rnk)
AS
(
    -- Seed: the first pair of each Id
    SELECT
        Id
       ,TRIM('"' || JSONName || '":' || COALESCE('"' || JSONValue || '"', 'null')) AS mergedList
       ,rnk
    FROM DB.SOURCE_TABLE
    WHERE rnk = 1
    UNION ALL
    -- Recursive step: append the pair with the next rnk of the same Id
    SELECT
        SourceTable.Id
       ,MergedTable.mergedList || ',' || TRIM('"' || SourceTable.JSONName || '":' || COALESCE('"' || SourceTable.JSONValue || '"', 'null')) AS mergedList
       ,SourceTable.rnk
    FROM DB.SOURCE_TABLE SourceTable
    INNER JOIN MergedTable MergedTable
        ON MergedTable.rnk + 1 = SourceTable.rnk
       AND SourceTable.Id = MergedTable.Id
)
-- Keep only the complete list (highest rnk) per Id and wrap it in braces
SELECT
    MergedTable.Id AS ResultId
   ,'{' || MergedTable.mergedList || '}' AS ResultJSON
FROM MergedTable
QUALIFY RANK() OVER (PARTITION BY ResultId ORDER BY rnk DESC) = 1
;
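The query above relies on a rnk column that numbers the pairs within each Id consecutively, starting at 1; the sample SourceTable in the question does not show such a column. One way it could be derived is sketched below. RankedSource is a hypothetical name, and ordering by JSONName is an arbitrary choice, since the order of the pairs does not matter.

```sql
-- Sketch: number the pairs of each Id as 1, 2, 3, ... so the recursive
-- query can walk them with rnk + 1. RankedSource is a hypothetical table name.
CREATE VOLATILE TABLE RankedSource AS
(
    SELECT
        Id
       ,JSONName
       ,JSONValue
       ,ROW_NUMBER() OVER (PARTITION BY Id ORDER BY JSONName) AS rnk
    FROM SourceTable
) WITH DATA
PRIMARY INDEX (Id)
ON COMMIT PRESERVE ROWS;
```

ROW_NUMBER is used rather than RANK so that the numbering has no ties or gaps even if a JSONName repeats within an Id. It may also be worth casting mergedList in the seed SELECT to a VARCHAR wide enough for 50 pairs, since, as far as I know, the recursive column takes its type from the seed.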