我需要将一些销售记录导入并解析到数据库中。
有些客户有不止一种产品,因此在JSON
Array
中有几个产品不同但收货地址相同的订单。
当我试图将地址保存在Addresses
表中时出现问题:
显然,我只需要每个地址的一条记录,为了实现这一点,我计算了地址字段的Hash
,并将其与表中已经存在的哈希进行比较:
看起来(好吧,我敢肯定),查询检查只有第一次,如果Hash
已经存在,如果它不存在,它添加尽可能多的行作为订单计数:
我可以想象这是OPENJSON
函数的标准行为,因为JSON
有效载荷的解析似乎是由内循环进行的,因此我认为我必须使用不同的方法....但是我不知道该用什么。
这里是JSON
有效载荷
declare @json nvarchar(max)=N'[
{
"id": 21660,
"currency": "USD",
"total": "15.00",
"shipping": {
"first_name": "Charles",
"last_name": "Leuschke",
"address_1": "3121 W Olive Ave",
"city": "Burbank",
"state": "CA",
"postcode": "91505",
"country": "US"
},
"line_items": [
{
"id": 1052
}
]
},
{
"id": 21659,
"currency": "USD",
"total": "38.00",
"shipping": {
"first_name": "Charles",
"last_name": "Leuschke",
"address_1": "3121 W Olive Ave",
"city": "Burbank",
"state": "CA",
"postcode": "91505",
"country": "US"
},
"line_items": [
{
"id": 1050
}
]
},
{
"id": 21658,
"currency": "USD",
"total": "38.00",
"shipping": {
"first_name": "Charles",
"last_name": "Leuschke",
"address_1": "3121 W Olive Ave",
"city": "Burbank",
"state": "CA",
"postcode": "91505",
"country": "US"
},
"line_items": [
{
"id": 1048
}
]
}
]'
和(简化的)query
Insert Into @Addresses
(
orderId,
fullName,
addressLine1,
city,
stateOrProvince,
postalCode,
countryCode,
addressCode
)
SELECT
o.orderId,
concat(s.firstName,' ',s.lastName),
s.addressLine1,
s.city,
s.stateOrProvince,
s.postalCode,
s.countryCode,
convert(nvarchar(64),hashbytes('SHA1',concat(s.firstName, ' ', s.lastName, s.addressLine1, s.city, s.stateOrProvince, s.postalCode, s.countryCode)),2)
FROM OPENJSON(@json)
WITH (
orderId nvarchar(64) '$.id',
shipping nvarchar(max) '$.shipping' AS JSON
) o
CROSS APPLY OPENJSON(shipping)
WITH (
firstName nvarchar(128) '$.first_name',
lastName nvarchar(128) '$.last_name',
addressLine1 nvarchar(128) '$.address_1',
city nvarchar(128) '$.city',
stateOrProvince nvarchar(64) '$.state',
postalCode nvarchar(64) '$.postcode',
countryCode nvarchar(4) '$.country'
) s
left join @Addresses a on a.addressCode=convert(nvarchar(64),hashbytes('SHA1',concat(s.firstName,' ', s.lastName, s.addressLine1, s.city, s.stateOrProvince, s.postalCode, s.countryCode)),2)
where a.addressCode is null
我还准备了一个sqlfiddle,您可以看到它返回3行,而目标是只获得一个
https://dbfiddle.uk/?rdbms=sqlserver_2019&小提琴= 3031 a99e3cd24f6bf383c0de29cf19a1
请使用标准的ROW_NUMBER()
重复数据删除方法,例如
WITH cte AS (
SELECT
o.orderId,
concat(s.firstName,' ',s.lastName) fullName,
s.addressLine1,
s.city,
s.stateOrProvince,
s.postalCode,
s.countryCode,
h.addressCode,
ROW_NUMBER() OVER (PARTITION BY h.addressCode ORDER BY o.OrderId ASC) rn
FROM OPENJSON(@json)
WITH (
orderId nvarchar(64) '$.id',
shipping nvarchar(max) '$.shipping' AS JSON
) o
CROSS APPLY OPENJSON(shipping)
WITH (
firstName nvarchar(128) '$.first_name',
lastName nvarchar(128) '$.last_name',
addressLine1 nvarchar(128) '$.address_1',
city nvarchar(128) '$.city',
stateOrProvince nvarchar(64) '$.state',
postalCode nvarchar(64) '$.postcode',
countryCode nvarchar(4) '$.country'
) s
CROSS APPLY (
VALUES (CONVERT(nvarchar(64), HASHBYTES('SHA1', CONCAT(s.firstName ,' ', s.lastName, s.addressLine1, s.city, s.stateOrProvince, s.postalCode, s.countryCode)),2))
) h (addressCode)
LEFT JOIN @Addresses a ON a.addressCode = h.AddressCode
WHERE a.addressCode IS NULL
)
INSERT INTO @Addresses
(
orderId,
fullName,
addressLine1,
city,
stateOrProvince,
postalCode,
countryCode,
addressCode
)
SELECT orderId,
fullName,
addressLine1,
city,
stateOrProvince,
postalCode,
countryCode,
addressCode
FROM cte
WHERE rn = 1;
注意:如果您使用CROSS APPLY
来计算哈希码,可以避免多次计算。