How do I create a column containing JSON whose key names are defined by the values of a column in another table?



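I have a source table that holds data in VARCHAR format, as in the example below. I want to insert the data into another table in JSON format (the result column itself can be of type JSON or VARCHAR).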

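  • For each Id there is at least one JSONName/JSONValue pair.
  • However, the Ids do not all have the same kinds or the same number of JSONName/JSONValue pairs.
  • Each Id can have at most 50 JSONName/JSONValue pairs.
  • The order of the pairs in the resulting JSON column value does not matter.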

SourceTable:
____________________________
| Id | JSONName | JSONValue |
|____|__________|___________|
| 1  | Name     | John      |
| 2  | Name     | Henry     |
| 2  | Age      | 32        |
| 3  | Age      | 56        |
| 3  | Location | US        |
| 4  | Age      | 24        |
| 4  | Name     | Andrew    |
| 4  | Location |           |

What I want:

Expected ResultTable:
____________________________________________________
| Id |               ResultJSON                     |
|____|______________________________________________|
| 1  | {"Name":"John"}                              |
| 2  | {"Name":"Henry","Age":"32"}                  |
| 3  | {"Age":"56", "Location":"US"}                |
| 4  | {"Age":"24","Name":"Andrew","Location":null} |

What I get with my current query:

Wrong resultTable:
_______________________________________________________________________________________________________________________________
| Id |               ResultJSON                                                                                                |
|____|_________________________________________________________________________________________________________________________|
| 1  | [{"JSONName":"Name","JSONValue":"John"}]                                                                                |
| 2  | [{"JSONName":"Name","JSONValue":"Henry"},{"JSONName":"Age","JSONValue":"32"}]                                           |
| 3  | [{"JSONName":"Age","JSONValue":"56"},{"JSONName":"Location","JSONValue":"US"}]                                          |
| 4  | [{"JSONName":"Age","JSONValue":"24"},{"JSONName":"Name","JSONValue":"Andrew"},{"JSONName":"Location","JSONValue":null}] |

Current query:

INSERT INTO ResultTable
(
Id
,ResultJSON
)
SELECT
SourceTable.Id
,JSON_AGG(SourceTable.JSONName,SourceTable.JSONValue)
FROM SourceTable
INNER JOIN OtherTable ON SourceTable.Id=OtherTable.Id
GROUP BY SourceTable.Id

Is it possible to do this with the Teradata JSON functions? If not, what would be the most efficient query?

You can use a regular expression to remove the unwanted parts:

SELECT
SourceTable.Id
,RegExp_Replace(Cast(Json_Agg(SourceTable.JSONName AS "#A",SourceTable.JSONValue AS "#B") AS VARCHAR(32000)), '"#A":|,"#B"|^\[|\]$|}(?=,{")|(?<="},){')
FROM SourceTable
GROUP BY 1

The regular expression removes all of the following:

  • "#A":
  • ,"#B"
  • the leading [
  • the trailing ]
  • } if it is followed by ,{"
  • { if it is preceded by "},
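
To see what each alternative in the pattern does, here is a minimal sketch that applies it to one aggregated string. It uses Python's re module as a stand-in for Teradata's RegExp_Replace, on the assumption that both engines treat these constructs the same way. The sample string is hypothetical; it is only shaped like the Json_Agg output for Id 2. The square brackets are written as \[ and \] so that they match literally.

```python
import re

# The pattern from the answer. The braces are escaped as well, which is
# optional in this position but avoids any ambiguity with quantifiers.
PATTERN = r'"#A":|,"#B"|^\[|\]$|\}(?=,\{")|(?<="\},)\{'

# Hypothetical input, shaped like the Json_Agg output for Id 2.
aggregated = '[{"#A":"Name","#B":"Henry"},{"#A":"Age","#B":"32"}]'

# Replacing every match with the empty string leaves a single JSON object.
flattened = re.sub(PATTERN, "", aggregated)
print(flattened)  # {"Name":"Henry","Age":"32"}
```

The lookahead and lookbehind inspect the neighbouring characters without consuming them, which is why the comma between the two pairs survives.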

Edit:

According to the comments, this regular expression leaves a superfluous opening brace behind. This seems to work better:

'"#A":|,"#B"|^\[|\]$|}(?=,)|(?<=,){'
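
The leftover brace shows up when a value is null, because the { that follows is then preceded by l}, rather than "},. A small sketch comparing both patterns, again with Python's re module as an assumed stand-in for Teradata's regex engine and with a hypothetical input string:

```python
import re

# Original and revised patterns (brackets and braces escaped for clarity).
FIRST = r'"#A":|,"#B"|^\[|\]$|\}(?=,\{")|(?<="\},)\{'
REVISED = r'"#A":|,"#B"|^\[|\]$|\}(?=,)|(?<=,)\{'

# Hypothetical input: the first pair carries a null value.
aggregated = '[{"#A":"Location","#B":null},{"#A":"Age","#B":"24"}]'

with_first = re.sub(FIRST, "", aggregated)
with_revised = re.sub(REVISED, "", aggregated)

print(with_first)    # {"Location":null,{"Age":"24"}   <- stray brace
print(with_revised)  # {"Location":null,"Age":"24"}
```

Note that the revised pattern is looser: a JSONValue that itself contains }, or ,{ would lose a brace as well, so this is only safe if such values cannot occur.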

This is the query I ended up with:

INSERT INTO DB.RESULT_TABLE
(
ResultId
,ResultJSON
)
WITH RECURSIVE MergedTable (Id, mergedList, rnk)
AS
(
SELECT
Id
,TRIM('"' || JSONName ||'":'|| COALESCE('"' || JSONValue || '"','null')) AS mergedList
,rnk
FROM DB.SOURCE_TABLE
WHERE rnk = 1
UNION ALL
SELECT
SourceTable.Id
,MergedTable.mergedList || ',' || TRIM('"' || SourceTable.JSONName ||'":' || COALESCE('"' || SourceTable.JSONValue || '"','null')) AS mergedList
,SourceTable.rnk
FROM DB.SOURCE_TABLE SourceTable
INNER JOIN MergedTable MergedTable
ON MergedTable.rnk + 1 = SourceTable.rnk
AND  SourceTable.Id = MergedTable.Id
)
SELECT
MergedTable.Id                       AS ResultId
,'{' || MergedTable.mergedList || '}' AS ResultJSON
FROM MergedTable
QUALIFY RANK() OVER (PARTITION BY ResultId ORDER BY rnk DESC) = 1
;
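
The query relies on a rnk column that is not among the SourceTable columns shown in the question; presumably it numbers the pairs within each Id starting at 1. Under that assumption, the string the recursion builds can be modelled in a few lines of Python. The rows below are hypothetical sample data and pair is a helper invented for this illustration:

```python
from itertools import groupby

# Hypothetical rows: (Id, JSONName, JSONValue, rnk); None stands for SQL NULL.
rows = [
    (2, "Name", "Henry", 1),
    (2, "Age", "32", 2),
    (4, "Age", "24", 1),
    (4, "Name", "Andrew", 2),
    (4, "Location", None, 3),
]

def pair(name, value):
    # Mirrors '"' || JSONName || '":' || COALESCE('"' || JSONValue || '"', 'null')
    return f'"{name}":' + ("null" if value is None else f'"{value}"')

result = {}
ordered = sorted(rows, key=lambda r: (r[0], r[3]))  # by Id, then rnk
for row_id, group in groupby(ordered, key=lambda r: r[0]):
    # Each recursion step appends ',' plus the next pair; join does the same.
    merged = ",".join(pair(name, value) for _, name, value, _ in group)
    result[row_id] = "{" + merged + "}"

print(result[4])  # {"Age":"24","Name":"Andrew","Location":null}
```

As in the SQL, the values are concatenated as-is, so a value containing a double quote or a backslash would produce invalid JSON unless it is escaped first.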
