我在一家服装公司工作,处理他们的订单数据。 订单数据分为 3 个数据库。包含其订单信息的ERP数据,如价格,数量等。 包含该订单的所有发货信息的发货数据。 以及一个 Shopify 数据库,其中包含使用 API 的 Shopify 的所有信息。
在过去的几个月里,我已经针对这三个表中的所有数据创建了一个全面的仪表板。 我一直在针对此数据编写存储过程,但问题是运行需要很长时间,因为我必须跨多个数据库联接多个表。
我想创建一个每天运行一到两次的作业,读取自上次运行以来收到的所有新订单,并将所有这些数据添加到一个数据库上的一个表中。 然后我所有的查询都将使用这个表,使一切运行得更快。
我还希望工作不仅要添加新订单,还要更新旧订单。假设我的表有"状态"字段。首次创建订单时,状态将为"已订购",但 2 天后它可能变为"已发货"。 而其他字段(例如Shipment_Number字段)在该信息更新之前将为 null。
有人可以帮我弄清楚如何做到这一点吗?
这是我的大型查询,它收集了我需要的所有数据:
SELECT DISTINCT
--TOP 10
O.Company_Code
,O.Division_Code
,O.Date_Entered
,O.Customer_Number
,O.Control_Number
,O.Customer_Purchase_Order_Number
,P.PickTicket_Number
,CASE
WHEN sh.packslip IS NULL
THEN ph.packslip
ELSE sh.packslip
END Packslip
,PST.Stage_Code Pickticket_Stage_Code
,I.Invoice_Number
,IST.Stage_Code Inovice_Stage_Code
,CASE
WHEN sh.packslip IS NULL
THEN z.status
ELSE 'SHIPPED & UPLOADED'
END Accellos_Status
,O.Order_Type_Code
--,O.Order_Status
,CASE
WHEN o.Order_Status = 30 THEN 'CANCELED'
WHEN o.Order_Status = 45 THEN 'SHIPPED'
WHEN o.Order_Status = 10 THEN 'NORMAL'
WHEN o.Order_Status = 20 THEN 'NORMAL'
END Order_Status
,OD.Odet_Line_Number
--,OD.Line_Status
,CASE
WHEN od.Line_Status = 90 THEN 'ALLOCATED'
WHEN od.Line_Status = 80 THEN 'CANCELED'
WHEN od.Line_Status = 70 THEN 'SHIPPED'
WHEN od.Line_Status = 50 THEN 'NORMAL'
END Line_Status
,S.Master_Item
,OD.Item_Number
,S.Description
,OD.Color_Code
,C.Color_Description
,OD.Quantity_Ordered
,OD.Quantity_Allocated
,OD.Quantity_Invoiced
,RT.Retail Unit_Retail
,(RT.Retail-OD.Price) Unit_MarkDown
,OD.Price Sell_Price
,OD.Discount_Value
,(OD.Price - OD.Discount_Value) Sale_Price
,((OD.Price - OD.Discount_Value)* (Royalty_Percentage/100)) Royalty_Cost
,CASE
WHEN ISNULL(s.Actual_Cost, 0) = 0
THEN ISNULL(s.Standard_Cost, 0)
ELSE s.Actual_Cost
END Item_Cost
,SR.Royalty_Code
,RC.Royalty_Percentage
,OD.Freight_Charges
--,CASE
-- WHEN sh.packslip is null
-- THEN ph.COST_SHIP
-- ELSE sh.COST_SHIP
--END Freight_Cost
,SM.Cost_Ship Master_Freight_Cost
,OD.Tax_Value
,CASE WHEN (OD.Quantity_Ordered <> OD.Quantity_Invoiced and OD.Quantity_Allocated=0) THEN OD.Quantity_Ordered - OD.Quantity_Invoiced ELSE 0 END ShortShippedAmount
,RD.Return_Number
,RD.Quantity_Returned
--,OW.Email
,CASE
WHEN AE.Email IS NULL -- Does it matter which one I do for then and for else? One is in the same database
THEN OW.Email
ELSE AE.Email
END Email
,CASE --Is there a better way to do this? When one table has the information and the other does't
WHEN sh.packslip IS NULL
THEN ph.ship_name
ELSE sh.SHIP_NAME
END Customer_Name
,CASE
WHEN sh.packslip IS NULL
THEN ph.SHIP_ADD1 + ' ' + ph.SHIP_ADD2
ELSE sh.SHIP_ADD1 + ' ' + sh.SHIP_ADD2
END Customer_Addr
,CASE
WHEN sh.packslip IS NULL
THEN ph.SHIP_CITY
ELSE sh.SHIP_CITY
END Customer_City
,CASE
WHEN sh.packslip is null
THEN ph.SHIP_PROV
ELSE sh.SHIP_PROV
END Customer_State
,CASE
WHEN sh.packslip is null
THEN ph.SHIP_ZIP
ELSE sh.SHIP_ZIP
END Customer_Zip
,o.Ship_Via_Code
,CASE
WHEN sh.packslip is null
THEN ph.SHIP_VIA
ELSE sh.SHIP_VIA
END Ship_Via
,CASE
WHEN sh.packslip is null
THEN ph.SERVICE
ELSE sh.SERVICE
END SERVICE
,CASE
WHEN sh.packslip is null
THEN ph.SHIP_NUM
ELSE sh.SHIP_NUM
END Ship_Num
,OW.Gateway
,OW.Tags
,OW.FinancialState
,OW.FulfillmentState
,OW.OrderCreated Shopify_Ordered
,OW.DateCreated Pro4Soft_Created
,OW.DownloadTime Pro4Soft_Downloaded
,OW.ConfirmedDownloadTime Pro4Soft_Confirmed
,FW.DateCreated Pro4Soft_Fulfilled
,OW.OrderUpdated Pro4Soft_Updated
,OW.OrderClosed Pro4Soft_Closed
,OW.OrderCancelled Pro4Soft_Cancelled
,OW.LastSynced Pro4Soft_Last_Synced
,OTSS.AMT_Created
,OTSS.AMT_Invoiced
,OTSS.AMT_Cancelled
,PTSS.Pickticket_Created
,PTSS.Pickticket_Allocated
,RLTSS.Accellos_Download
,RLTSS.Accellos_Allocated
,RLTSS.Accellos_Waved
,RLTSS.Accellos_Label
,RLTSS.Accellos_Last_Pick
,RLTSS.Accellos_Rating
,RLTSS.Accellos_Shipped
,RLTSS.Accellos_Upload
,CASE
WHEN sh.packslip IS NULL
THEN ph.Date_modfy
ELSE sh.Date_modfy
END Accellos_Timestamp
,SM.Tracktrace
--,*
--AMT Tables
FROM [JMNYC-AMTDB].[AMTPLUS].[dbo].Orders O WITH (NOLOCK)
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Order_Detail OD WITH (NOLOCK)
on O.Company_Code = OD.Company_Code
and O.Division_Code = OD.Division_Code
and O.Control_Number = OD.Control_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].PickTickets P WITH (NOLOCK)
on O.Company_Code = P.Company_Code
and O.Division_Code = P.Division_Code
and O.Control_Number = P.Control_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].PickTickets_Stage PST WITH (NOLOCK)
ON o.Company_Code=pst.Company_Code
AND o.Division_Code=pst.Division_Code
AND p.PickTicket_Number=pst.PickTicket_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices I WITH (NOLOCK)
on O.Company_Code = I.Company_Code
and O.Division_Code = I.Division_Code
and O.Control_Number = I.Control_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices_Detail ID WITH (NOLOCK)
on O.Company_Code = ID.Company_Code
and O.Division_Code = ID.Division_Code
and O.Control_Number = ID.Control_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices_Stage IST WITH (NOLOCK)
ON o.Company_Code = ist.Company_Code
AND o.Division_Code = ist.Division_Code
AND i.Invoice_Number = ist.Invoice_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Style S WITH (NOLOCK)
on O.Company_Code = S.Company_Code
and O.Division_Code = S.Division_Code
and OD.Item_Number = S.Item_Number
and OD.Color_Code = S.Color_Code
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Color C WITH (NOLOCK)
on S.Color_Code = C.Color_Code
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Returns_Header RH WITH (NOLOCK)
on O.Company_Code = RH.Company_Code
and O.Division_Code = RH.Division_Code
and I.Invoice_Number = RH.Invoice_Number
and O.Customer_Purchase_Order_Number = RH.Customer_Purchase_Order_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Returns_Detail RD WITH (NOLOCK)
on O.Company_Code = RD.Company_Code
and O.Division_Code = RD.Division_Code
and O.Control_Number = RD.Control_Number
and RH.Return_Number = RD.Return_Number
and OD.Item_Number = RD.Item_Number
and OD.Color_Code = RD.Color_Code
and OD.Odet_Line_Number = RD.Odet_Line_Number
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Style_Royalty SR WITH (NOLOCK)
on S.Company_Code = SR.Company_Code
and S.Division_Code = SR.Division_Code
and S.Item_Number = SR.Item_Number
and S.Color_Code = SR.Color_Code
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Royalty_By_Customer RC WITH (NOLOCK)
on S.Company_Code = RC.Company_Code
and S.Division_Code = RC.Division_Code
and SR.Royalty_Code = RC.Royalty_Code
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].Z_N_FullRetailTable RT WITH (NOLOCK)
on OD.Item_Number = RT.SKU
and OD.Color_Code = RT.Color
LEFT JOIN [JMNYC-AMTDB].[AMTPLUS].[dbo].ArchivedEmails AE WITH (NOLOCK)
ON O.Customer_Purchase_Order_Number = AE.OrderNumber
--WMS Tables
LEFT JOIN [JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].pickhead ph WITH (NOLOCK) --When the order is not shipped and invoiced yet it is in this table.
ON p.PickTicket_Number = ph.packslip
LEFT JOIN [JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].SHIPHIST sh WITH (NOLOCK) --After order is Shipped it gets deleted from Pickhead and moves to this table.
ON o.Customer_Purchase_Order_Number = sh.cust_po
LEFT JOIN [JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].SHIPMSTR2 SM WITH (NOLOCK)
ON o.Customer_Purchase_Order_Number = sm.Cust_Order
and sh.packslip=sm.packslip
LEFT JOIN [JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].Z_Status z WITH (NOLOCK)
ON ph.PROCSTEP = z.procstep
--Shopify Tables
LEFT JOIN [SHOPIFY].[shopify_moret].[dbo].[OrderWrappers] OW WITH (NOLOCK)
ON O.Customer_Purchase_Order_Number = OW.OrderName
LEFT JOIN [SHOPIFY].[shopify_moret].[dbo].[FulfillmentWrappers] FW WITH (NOLOCK)
ON OW.Id =FW.OrderWrapperId
--Timestampt Tables (Mix of AMT and WMS)
LEFT JOIN (
SELECT
Company_Code
,Division_Code
,Control_Number
,max( case when Action like 'New%' then Date end) as AMT_Created
,max( case when Action like '%Invoice%' then Date end) as AMT_Invoiced
,max( case when Action like '%Cancel%' then Date end) as AMT_Cancelled
FROM(
SELECT DISTINCT
Company_Code
,Division_Code
,Control_Number
,Date
,Action
from [JMNYC-AMTDB].[AMTPLUS].[dbo].Order_Audit_Trail WITH (NOLOCK)
)OTS
group by Company_Code, Division_Code, Control_Number
)OTSS
on O.Control_Number = OTSS.Control_Number
and O.Company_Code = OTSS.Company_Code
and O.Division_Code = OTSS.Division_Code
LEFT JOIN (
SELECT
Company_Code
,Division_Code
,PickTicket_Number
,max( case when Action like 'Created%' then Date end) as Pickticket_Created
,max( case when Action like 'Inserted%' then Date end) as Pickticket_Allocated
,max( case when Action like 'Updated%' then Date end) as Pickticket_Updated
--,max( case when Action like '%Cancel%' then Date end) as Pickticket_Cancelled
FROM(
SELECT DISTINCT
Company_Code
,Division_Code
,PickTicket_Number
,Date
,Action
from [JMNYC-AMTDB].[AMTPLUS].[dbo].PickTickets_Audit_Trail WITH (NOLOCK)
)PTS
group by Company_Code, Division_Code, PickTicket_Number
)PTSS
on O.Company_Code = PTSS.Company_Code
and O.Division_Code = PTSS.Division_Code
and PTSS.PickTicket_Number = COALESCE( sh.packslip, P.PickTicket_Number)
LEFT JOIN (
SELECT
Packslip
,max( case when Action like 'DNLOAD' then Date_Time end) as Accellos_Download
,max( case when Action like 'ALLOC' then Date_Time end) as Accellos_Allocated
,max( case when Action like 'WAVEORDR' then Date_Time end) as Accellos_Waved
,max( case when Action like 'NEWLABEL' then Date_Time end) as Accellos_Label
,max( case when Action like 'EOL_LSTP' then Date_Time end) as Accellos_Last_Pick
,max( case when Action like 'RATED' then Date_Time end) as Accellos_Rating
,max( case when Action like 'SHIPPED' then Date_Time end) as Accellos_Shipped
,max( case when Action like 'UPLOAD' then Date_Time end) as Accellos_Upload
FROM(
SELECT DISTINCT
Packslip
,Date_Time
,Action
from [JMDNJ-ACCELSQL].[A1Warehouse].[dbo].[RF_LOG2] RL WITH (NOLOCK)
)RLTS
group by Packslip
)RLTSS
on RLTSS.PACKSLIP = Coalesce(sh.PACKSLIP, TRY_CAST(P.pickticket_number as nvarchar))
WHERE
--(O.Company_Code = '09' OR '09' IS NULL) AND
--(O.Division_Code = '001' OR '001' IS NULL) AND
O.Customer_Number = 'ecom2x' AND
ISNUMERIC(O.Customer_Purchase_Order_Number) <> 0
AND O.Order_Type_Code <> 'B'
AND O.Date_Entered BETWEEN '8/1/2019' AND DATEADD(dayofyear, 1, GETDATE())
ORDER BY O.Date_Entered DESC
如果我像这样运行此查询,过去一周的订单需要四个多小时,谁知道自时间开始以来所有订单需要多少小时。(尽管似乎添加行不会成倍增加时间)。
我认为将此数据插入表中并不太困难。我可以在查询之前添加一个插入语句,让它在一夜之间运行。
困难的部分是保持它的"活力">
我认为最好的方法是做一次大的插入语句。这会将大部分数据放入表中。
然后,我将创建一个运行此查询的作业,但仅限于过去 40 天(因为 40 天后,订单信息几乎不会更改)。
下面是一个示例结果:
Company_Code Division_Code Date_Entered Customer_Number Control_Number Customer_Purchase_Order_Number PickTicket_Number Packslip Pickticket_Stage_Code Invoice_Number Inovice_Stage_Code Accellos_Status Order_Type_Code Order_Status Odet_Line_Number Line_Status Master_Item Item_Number Description Color_Code Color_Description Quantity_Ordered Quantity_Allocated Quantity_Invoiced Unit_Retail Unit_MarkDown Sell_Price Discount_Value Sale_Price Royalty_Cost Item_Cost Royalty_Code Royalty_Percentage Freight_Charges Master_Freight_Cost Tax_Value ShortShippedAmount Return_Number Quantity_Returned Email Customer_Name Customer_Addr Customer_City Customer_State Customer_Zip Ship_Via_Code Ship_Via SERVICE Ship_Num Gateway Tags FinancialState FulfillmentState Shopify_Ordered Pro4Soft_Created Pro4Soft_Downloaded Pro4Soft_Confirmed Pro4Soft_Fulfilled Pro4Soft_Updated Pro4Soft_Closed Pro4Soft_Cancelled Pro4Soft_Last_Synced AMT_Created AMT_Invoiced AMT_Cancelled Pickticket_Created Pickticket_Allocated Accellos_Download Accellos_Allocated Accellos_Waved Accellos_Label Accellos_Last_Pick Accellos_Rating Accellos_Shipped Accellos_Upload Accellos_Timestamp Tracktrace
1 7 14:00.4 ECOM2X 511322 1721057 NULL 11422781 NULL 10704058 PRINTREADY SHIPPED & UPLOADED N SHIPPED 1 SHIPPED 1950 17DN19501XLEC PLUS SIZE BERMUDA SHORT 115 RICH BLACK 2 0 2 28 0 28 0 28 1.96 6.49 DANSKIN 7 0 7.71 0 0 NULL NULL wwright5333@yahoo.com Wendy WRIGHT 5596 NORTH RD FRIENDSVILLE PA 18818 SGRD USPS Priority Mail SHP1580902 shopify_payments Riskified::approved, Riskified::submitted paid fulfilled 2019-08-06 12:12:47.0000000 +00:00 2019-08-06 12:13:30.2019290 +00:00 2019-08-06 12:13:41.9336199 +00:00 2019-08-06 12:14:20.5300000 +00:00 2019-08-06 22:31:37.3958947 +00:00 2019-08-06 23:45:11.0000000 +00:00 2019-08-06 22:31:05.0000000 +00:00 NULL 2019-08-06 23:45:30.1978836 +00:00 14:00.5 30:09.6 NULL 14:01.0 NULL 20190806 08:17:23.67 20190806 08:19:43.82 20190806 08:19:56.43 20190806 08:19:57.18 20190806 11:18:57.31 20190806 12:50:32.00 20190806 15:57:50.87 20190806 15:58:01.76 57:50.8 9.40552E+21
这就是我需要帮助的。
如何创建一个过程/作业来比较表中过去 30-40 天的订单,并查看查询行中的新增内容。 然后,我可以每天运行此作业一次或两次,以更新所有时间戳和运输信息。
也许创建此作业的两个版本会更好?一个每天仅运行一次,用于更新所有时间戳信息,另一个运行一次,用于更新所有返回信息,用于更新所有返回信息。 还是这太复杂了?
作业需要检查每行是否有任何更改,以及是否有更新。如果没有,什么都不做。有没有办法优化它?就像一旦它看到行没有变化,它就会跳过它。
(另外,如果有人对查询有任何一般提示,我将不胜感激。我试图使它尽可能可读。
另外,像这样做所有的数学计算更好吗
,((OD.Price - OD.Discount_Value)* (Royalty_Percentage/100)) Royalty_Cost
在存储过程中,因此结果已经在我的表中,还是对我将针对此新表进行的查询进行计算更好?还是没关系?
我一直在针对此数据编写存储过程,但问题是运行需要很长时间,因为我必须跨多个数据库联接多个表。
"多个数据库"问题不是问题。 SQL Server数据库是管理和安全边界,但跨数据库选择/联接没有任何困难。 所有数据库统计信息都可用于查询优化器。
我想创建一个工作...将所有这些数据添加到一个数据库上的一个表中
小心你的愿望。 正如您已经预见的那样,维护该表将是有效的,并且永远不会是最新的。 即使在更新时,基表中的信息也可能更改。
具体化视图建议很好,但此查询的某些方面确实使查询优化器变得困难,从表数开始。
最简单的改进是使用合并而不是IF X IS NULL
构造。 这可能会对某些人有所帮助。
我假设您有适合每个连接的索引,并且您已确认它们正在被使用。 如果索引不是聚集索引,您可能会发现覆盖索引更有帮助。 (覆盖索引具有许多与实例化视图相同的优点。
您使用"时间戳表"的方式尤其成问题。 这个微不足道的错误是 SELECT DISTINCT 在结果聚合的子查询中(使用 GROUP BY)。 区分是不必要的;如果幸运的话,无论如何都会在优化过程中消除它。
更重要的是你为这样的结构付出的代价
,max( case when Action like 'New%' then Date end) as AMT_Created
因为系统必须对每一行进行模式匹配,不止一次。 如果需要将任何内容写入中间表,那就是这些查询。
左联接也会带来成本。 它们真的有必要吗? 也许您可以定期(可能每 15 分钟左右)检查一次跨数据库一致性,并在查询中使用内部联接。
还有两点:
- 确保服务器有足够的 RAM。 越多越好,尤其是对于联接。
- 努力完全理解查询优化器的报告,并确保每个存储过程都以最佳方式访问表。 一般来说,SQL Server 性能的解决方案是首先索引,其次是临时表,最后是其他所有内容。
每个人迟早都会试图用一张大"扁平"桌子来解决性能问题。 这是很多工作,而且很少有好结果。 最好隔离奇怪的麻烦制造者,并将数据库中的重复数据保持在最低限度。