Read any JSON into a list of key-value pairs (EAV format) in SQL Server



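While looking for a way to read any JSON with SQL Server's JSON methods, without having to know its inner structure, I came up with an approach I'd like to share.

This is the question that prompted it.

The question is: how can unknown JSON be transformed into a structured EAV format while keeping all information about sort order and nesting level?

The ideal output is a sorted list that carries the id of the original row as the entity, the keys and values of the JSON as attributes and values, and the JsonPath to each specific object.

An MCVE is embedded in my self-answer (the sample data is taken from the linked question).

First we create a declared table variable and fill it with some sample JSON to mock up the issue (I added some arrays to the sample to show how array positions appear in the JSON path):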

DECLARE @table TABLE(ID INT IDENTITY, AnyJSON NVARCHAR(MAX));
INSERT INTO @table VALUES
(N' {
    "correlationId": "c3xOeEEQQCCA9sEx7-u6FA",
    "eventCreateTime": "2020-05-12T15:38:23.717Z",
    "time": 1589297903717,
    "owner": {
        "ownergeography": {
            "city": "abc",
            "country": "abc"
        },
        "ownername": {
            "firstname": "abc",
            "lastname": "def"
        },
        "clientApiKey": "xxxxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}')
,(N'[{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
}
,{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
}]')
,(N'{
    "correlationId": "ls7XmuuiThWzktUeewqgWg",
    "eventCreateTime": "2020-05-12T12:40:20.786Z",
    "time": 1589287220786,
    "modifiedBy": {
        "clientId": null,
        "clientApiKey": "xxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}');

--the query

WITH recCTE AS
(
    --anchor: one row per source row, the whole JSON is the first fragment to open
    SELECT ID
          ,NestLevel   = 0
          ,ObjectIndex = CAST(1 AS BIGINT)
          ,SortString  = CAST(N'sort' COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonPath    = CAST(N'$'    COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonKey     = CAST(N'$'    COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonValue   = CAST(AnyJSON COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonType    = CAST(CASE WHEN LEFT(TRIM(AnyJSON),1)=N'[' THEN 4 ELSE 0 END AS TINYINT)
          ,NestedJSON  = CAST(CASE WHEN ISJSON(AnyJSON)=1
                                   THEN AnyJSON
                                   ELSE NULL END COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
    FROM @table t
    UNION ALL
    --recursion: open every fragment that is valid JSON and go one level deeper
    SELECT r.ID
          ,r.NestLevel+1
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
          ,CAST(CONCAT(r.SortString,REPLACE(STR(ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),5),' ','0')) COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(CONCAT(r.JsonPath, CASE WHEN r.JsonType=4 --<-- see the docs for OPENJSON()
                                        THEN CONCAT('[',A.[key],']')
                                        ELSE '.' + A.[key] END) COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(A.[key]     COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(r.JsonValue COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,A.[type]
          ,CAST(A.[value]   COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
    FROM recCTE r
    CROSS APPLY OPENJSON(r.NestedJSON) A
    WHERE ISJSON(r.NestedJSON)=1
)
--keep only the leaf values, i.e. fragments that cannot be opened any further
--(JSON null comes back as NULL, ISJSON(NULL) is NULL, so those rows are dropped here as well)
SELECT ID
      ,NestLevel
      ,ObjectIndex
      ,JsonPath
      ,JsonKey
      ,NestedJSON AS JsonValue
      ,SortString --<-- just to illustrate the sorting, not needed in the output
FROM recCTE
WHERE ISJSON(NestedJSON)=0
ORDER BY ID,SortString;
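
The SortString expression is easier to follow in isolation. The following is a minimal sketch with made-up numbers (the derived table v and its column n are only for illustration): STR() right-aligns the number in 5 characters, and REPLACE() turns the padding blanks into zeros, so that sorting the strings gives the same order as sorting the numbers.

```sql
-- v(n) stands in for the ROW_NUMBER() values of the recursive part (hypothetical sample values)
SELECT v.n
      ,Padded = REPLACE(STR(v.n,5),' ','0')  -- e.g. 2 -> 00002, 10 -> 00010
FROM (VALUES (1),(2),(10),(123)) v(n)
ORDER BY Padded;                             -- string sort, but in numeric order thanks to the padding
```

Without the padding, the string '10' would sort before '2'. With a width of 5 this holds as long as a single array or object has fewer than 100000 elements.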

The result

+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| ID | JsonPath                                  | JsonKey         | JsonValue                | SortString                      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.correlationId                           | correlationId   | c3xOeEEQQCCA9sEx7-u6FA   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T15:38:23.717Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.time                                    | time            | 1589297903717            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.city               | city            | abc                      | 0    4    1    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.country            | country         | abc                      | 0    4    1    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.firstname               | firstname       | abc                      | 0    4    2    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.lastname                | lastname        | def                      | 0    4    2    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.clientApiKey                      | clientApiKey    | xxxxx                    | 0    4    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    1    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    1    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    1    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[0].a | a               | b                        | 0    1    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[1].c | c               | d                        | 0    1    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[2].e | e               | f                        | 0    1    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].variantId                            | variantId       | 1278915                  | 0    1    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].utmCampaign                          | utmCampaign     |                          | 0    1    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].ua.os.major                          | ua.os.major     | 8                        | 0    1    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    2    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    2    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    2    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[0].a | a               | b                        | 0    2    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[1].c | c               | d                        | 0    2    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[2].e | e               | f                        | 0    2    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].variantId                            | variantId       | 1278915                  | 0    2    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].utmCampaign                          | utmCampaign     |                          | 0    2    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].ua.os.major                          | ua.os.major     | 8                        | 0    2    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.correlationId                           | correlationId   | ls7XmuuiThWzktUeewqgWg   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T12:40:20.786Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.time                                    | time            | 1589287220786            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.modifiedBy.clientApiKey                 | clientApiKey    | xxx                      | 0    4    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
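
The JsonPath values can be fed back into the JSON functions. The following is a small sketch, assuming SQL Server 2016 or later, where @json is a made-up, shortened copy of the second sample row. It also shows one caveat that the result hints at: the key ua.os.major contains dots, so the generated path $[0].ua.os.major would presumably have to be rewritten with the key in double quotes before it can be used as a path.

```sql
-- @json is a hypothetical, shortened copy of the second sample row
DECLARE @json NVARCHAR(MAX) = N'[{"correlationIds":[{"campaignId":[1,2,3]}],"ua.os.major":"8"}]';

SELECT JSON_VALUE(@json,N'$[0].correlationIds[0].campaignId[1]') AS SecondCampaignId --path taken as-is from the JsonPath column
      ,JSON_VALUE(@json,N'$[0]."ua.os.major"')                    AS QuotedKey        --key with dots wrapped in double quotes
      ,JSON_VALUE(@json,N'$[0].ua.os.major')                      AS UnquotedKey;     --read as nested keys ua -> os -> major
```

The unquoted variant is not expected to find the value, because the path is interpreted as three nested keys.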

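The idea in short:

  • We use a recursive CTE to work down through the JSON.
  • The query tests each fragment (the [value] returned by OPENJSON) with ISJSON() to see whether it is valid JSON.
  • If the fragment is valid JSON, it is opened again, and this goes deeper and deeper.
  • The column SortString is needed to get the final sort order.
  • The CAST() and COLLATE calls help to avoid data type mismatches. Recursive CTEs are very picky about this...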
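
To see what each recursion step works with, it helps to look at what OPENJSON() returns when it is called without a WITH clause: one row per element with the columns [key], [value] and [type]. The sketch below uses a made-up variable @json; the type codes in the comment are the ones listed in the OPENJSON documentation.

```sql
-- @json is a hypothetical sample with one element of each JSON type
DECLARE @json NVARCHAR(MAX) = N'{"s":"text","n":1,"b":true,"a":[1,2],"o":{"x":1},"z":null}';

-- [type]: 0 = null, 1 = string, 2 = number, 3 = true/false, 4 = array, 5 = object
SELECT A.[key]
      ,A.[value]
      ,A.[type]
      ,IsNested = ISJSON(A.[value]) --1 for the array and the object, 0 for scalar values, NULL for JSON null
FROM OPENJSON(@json) A;
```

Only the rows with IsNested = 1 are opened again by the recursive part of the query. Type 4 (array) is the one the JsonPath expression checks in order to write [index] instead of .key.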

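Hint: If you deal with larger or deeply nested JSON, you might need to append OPTION (MAXRECURSION 0) to the end of the query. By default a recursive CTE is limited to 100 recursion levels.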
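
The placement of the hint can be tried on a tiny example first. This is a sketch with a made-up counting CTE (the names n and i are just for illustration), not the JSON query itself. Without the OPTION clause SQL Server aborts the statement once the default of 100 recursion levels is exhausted; MAXRECURSION 0 removes the limit.

```sql
--hypothetical counting CTE that needs more than 100 recursion levels
WITH n AS
(
    SELECT 1 AS i
    UNION ALL
    SELECT i + 1 FROM n WHERE i < 500
)
SELECT COUNT(*) AS RowsProduced
FROM n
OPTION (MAXRECURSION 0); --query hint, always at the very end of the statement
```

Applied to the JSON query, the clause goes after the final ORDER BY ID,SortString.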

Enjoy :-)

Something similar for XML

Here is a similar answer about how to read unknown XML.
