在查询中使用 EXPLAIN 时,我应该从哪里开始寻找优化?



我有一个查询,使用 3MB 的 3MB work_mem在 postgres 中需要 512 个小时。

查询:

SELECT
nextval(
'rpro_vistas_hd.seq_ventas_inventario_cod'
) AS COD,
MAX (tmpsem.fecha) AS fecha_inventario,
MAX (inve.inventario_monto) AS MONTO_INVENTARIO,
MAX (inve.inventario_unidades) AS "UNIDADES_INVENTARIO",
CASE
WHEN SUM (venta.monto_venta) IS NULL THEN
0
ELSE
CAST (
SUM (venta.monto_venta) AS BIGINT
)
END AS "MONTO_VENTA",
CASE
WHEN SUM (venta.unidad_venta) IS NULL THEN
0
ELSE
CAST (
SUM (venta.unidad_venta) AS BIGINT
)
END AS "UNIDADES_VENTA",
ll.cod_local AS "COD_LOCAL",
prod.cod_plu AS "COD_PLU_PRODUCTO",
prod.cod_plu_retail AS "COD_PLU_RETAIL",
9999 :: NUMERIC AS "COD_PERIODO",
to_char(- 1, '99') AS historial_alerta,
CASE
WHEN SUM (venta.unidad_venta / 7) > 0 THEN
CASE
WHEN (
MAX (inve.inventario_unidades) / SUM (venta.unidad_venta / 7)
) IS NULL THEN
0
ELSE
CAST (
MAX (inve.inventario_unidades) / SUM (venta.unidad_venta / 7) AS INTEGER
)
END
ELSE
0
END AS rotacion_inventario,
CASE
WHEN (SUM(venta.unidad_venta) > 0) THEN
(
SUM (venta.monto_venta) / SUM (venta.unidad_venta)
) * 1.19
ELSE
((- 999999)) :: INTEGER
END AS precio_promedio,
bool_or(habi.habilitado_retailer) AS "HABILITADO_LOCAL",
SUM (
res.monto_ventas_ultimo_anio_movil
) AS "MONTO_VENTAS_ULTIMO_ANIO_MOVIL",
CASE
WHEN MAX (opp.oportunidad_mo) IS NULL THEN
0
ELSE
MAX (opp.oportunidad_mo)
END AS oportunidad_monto,
CASE
WHEN MAX (opp.oportunidad_un) IS NULL THEN
0
ELSE
MAX (opp.oportunidad_un)
END AS oportunidad_inventario
FROM
rpro_vistas_hd.perfil_1_clr_ultimo_inventario inve
INNER JOIN rpro_dim.dim_locales ll ON inve.cod_local = ll.cod_local
INNER JOIN rpro_dim.dim_productos_retail prod ON prod.cod_plu_retail = inve.cod_plu_retail
AND inve.cod_plu = prod.cod_plu
AND ll.cod_retailer = prod.cod_retailer
AND ll.cod_cadena = prod.cod_cadena
JOIN rpro_vistas_hd.perfil_1_clr_periodo_diario_y_semanal AS tmpsem ON tmpsem.fecha = inve.fecha_inventario
AND ll.cod_cadena = tmpsem.cod_cadena
LEFT JOIN rpro_vistas_hd.perfil_1_clr_ultima_oportunidad opp ON opp.cod_local = inve.cod_local
AND opp.cod_plu = inve.cod_plu
AND opp.cod_plu_retail = inve.cod_plu_retail
LEFT JOIN rpro_admin.rms_habilitacion AS habi ON habi.cod_plu_retail = prod.cod_plu_retail
AND habi.cod_plu = prod.cod_plu
AND habi.cod_local = ll.cod_local
LEFT JOIN rpro_vistas_hd.perfil_1_clr_resumen_producto_por_local_full AS res ON res.cod_local = ll.cod_local
AND res.cod_plu_retail = prod.cod_plu_retail
AND res.cod_plu_producto = prod.cod_plu
LEFT JOIN (
SELECT
vvdd.id_dim_local,
vvdd.id_dim_prod_retail,
SUM (
vvdd.sellout_dia_actual_mon
) AS monto_venta,
SUM (
vvdd.sellout_dia_actual_uni
) AS unidad_venta
FROM
rpro_dim.dim_ventas_diarias AS vvdd
JOIN rpro_dim.dim_tiempo_diario AS tmp ON tmp.id_dim_tiempo_diario = vvdd.id_dim_tiempo_diario
JOIN rpro_dim.dim_locales ll ON vvdd.id_dim_local = ll.id_dim_local
JOIN rpro_dim.dim_productos_retail pp ON vvdd.id_dim_prod_retail = pp.id_dim_prod_retail
RIGHT JOIN rpro_vistas_hd.perfil_1_clr_periodo_diario_y_semanal tmpsem ON vvdd.id_dim_tiempo_diario = tmpsem.id_dim_tiempo_diario
AND ll.cod_cadena = tmpsem.cod_cadena
GROUP BY
vvdd.id_dim_local,
vvdd.id_dim_prod_retail
) AS venta ON venta.id_dim_local = ll.id_dim_local
AND venta.id_dim_prod_retail = prod.id_dim_prod_retail
AND prod.cod_plu != 'NOCORRESPONDE'
GROUP BY
prod.cod_plu,
ll.cod_local,
prod.cod_plu_retail
ORDER BY
ll.cod_local,
prod.cod_plu_retail;

当我对这个查询执行解释时,我得到了这个:

GroupAggregate  (cost=694653.37..706072.47 rows=1 width=1515)
Group Key: ll.cod_local, prod.cod_plu_retail, prod.cod_plu
->  Nested Loop Left Join  (cost=694653.37..706072.34 rows=1 width=1451)
Join Filter: ((res.cod_local)::text = (ll.cod_local)::text)
->  Nested Loop Left Join  (cost=694652.95..706038.43 rows=1 width=1447)
Join Filter: (((opp.cod_local)::text = (inve.cod_local)::text) AND ((opp.cod_plu)::text = (inve.cod_plu)::text) AND ((opp.cod_plu_retail)::text = (inve.cod_plu_retail)::text))
->  Merge Join  (cost=694652.95..696760.40 rows=1 width=2633)
Merge Cond: (((ll.cod_local)::text = (inve.cod_local)::text) AND ((prod.cod_plu_retail)::text = (inve.cod_plu_retail)::text) AND ((prod.cod_plu)::text = (inve.cod_plu)::text))
Join Filter: (tmpsem.cod_cadena = ll.cod_cadena)
->  Merge Left Join  (cost=662750.31..662799.00 rows=4808 width=1323)
Merge Cond: (((ll.cod_local)::text = (habi.cod_local)::text) AND ((prod.cod_plu_retail)::text = (habi.cod_plu_retail)::text) AND ((prod.cod_plu)::text = (habi.cod_plu)::text))
->  Sort  (cost=662737.93..662749.95 rows=4808 width=1322)
Sort Key: ll.cod_local, prod.cod_plu_retail, prod.cod_plu
->  Hash Right Join  (cost=661432.51..662443.90 rows=4808 width=1322)
Hash Cond: ((vvdd.id_dim_local = ll.id_dim_local) AND (vvdd.id_dim_prod_retail = prod.id_dim_prod_retail))
Join Filter: ((prod.cod_plu)::text <> 'NOCORRESPONDE'::text)
->  GroupAggregate  (cost=652944.88..653562.94 rows=22475 width=72)
Group Key: vvdd.id_dim_local, vvdd.id_dim_prod_retail
->  Sort  (cost=652944.88..653001.07 rows=22475 width=48)
Sort Key: vvdd.id_dim_local, vvdd.id_dim_prod_retail
->  Nested Loop Left Join  (cost=1307.72..651320.38 rows=22475 width=48)
->  Seq Scan on perfil_1_clr_periodo_diario_y_semanal tmpsem_1  (cost=0.00..1.84 rows=84 width=8)
->  Nested Loop  (cost=1307.72..7751.11 rows=268 width=56)
->  Nested Loop  (cost=1307.43..7665.71 rows=268 width=56)
->  Hash Join  (cost=1307.15..7583.30 rows=268 width=56)
Hash Cond: (vvdd.id_dim_local = ll_1.id_dim_local)
Join Filter: (ll_1.cod_cadena = tmpsem_1.cod_cadena)
->  Bitmap Heap Scan on dim_ventas_diarias vvdd  (cost=900.95..6307.55 rows=53511 width=52)
      Recheck Cond: (id_dim_tiempo_diario = tmpsem_1.id_dim_tiempo_diario)
      ->  Bitmap Index Scan on dim_ventas_diarias_id_dim_tiempo_diario_idx  (cost=0.00..887.58 rows=53511 width=0)
            Index Cond: (id_dim_tiempo_diario = tmpsem_1.id_dim_tiempo_diario)
->  Hash  (cost=358.31..358.31 rows=3831 width=8)
      ->  Seq Scan on dim_locales ll_1  (cost=0.00..358.31 rows=3831 width=8)
->  Index Only Scan using dim_tiempo_diario_pkey on dim_tiempo_diario tmp  (cost=0.28..0.30 rows=1 width=4)
Index Cond: (id_dim_tiempo_diario = vvdd.id_dim_tiempo_diario)
->  Index Only Scan using id_dim_prod_retail_hist_pkey2 on dim_productos_retail pp  (cost=0.29..0.31 rows=1 width=4)
Index Cond: (id_dim_prod_retail = vvdd.id_dim_prod_retail)
->  Hash  (cost=8415.51..8415.51 rows=4808 width=1266)
->  Merge Join  (cost=7962.17..8415.51 rows=4808 width=1266)
Merge Cond: ((ll.cod_cadena = prod.cod_cadena) AND (ll.cod_retailer = prod.cod_retailer))
->  Sort  (cost=586.32..595.90 rows=3831 width=230)
Sort Key: ll.cod_cadena, ll.cod_retailer
->  Seq Scan on dim_locales ll  (cost=0.00..358.31 rows=3831 width=230)
->  Sort  (cost=7375.85..7501.36 rows=50204 width=1044)
Sort Key: prod.cod_cadena, prod.cod_retailer
->  Seq Scan on dim_productos_retail prod  (cost=0.00..3456.04 rows=50204 width=1044)
->  Sort  (cost=12.37..12.52 rows=60 width=1251)
Sort Key: habi.cod_local, habi.cod_plu_retail, habi.cod_plu
->  Seq Scan on rms_habilitacion habi  (cost=0.00..10.60 rows=60 width=1251)
->  Sort  (cost=31902.64..32407.94 rows=202119 width=1322)
Sort Key: inve.cod_local, inve.cod_plu_retail, inve.cod_plu
->  Hash Join  (cost=2.89..14091.06 rows=202119 width=1322)
Hash Cond: (inve.fecha_inventario = tmpsem.fecha)
->  Seq Scan on perfil_1_clr_ultimo_inventario inve  (cost=0.00..10262.35 rows=481235 width=1318)
->  Hash  (cost=1.84..1.84 rows=84 width=8)
->  Seq Scan on perfil_1_clr_periodo_diario_y_semanal tmpsem  (cost=0.00..1.84 rows=84 width=8)
->  Seq Scan on perfil_1_clr_ultima_oportunidad opp  (cost=0.00..4880.10 rows=251310 width=1612)
->  Index Scan using perfil_1_clr_resumen_producto_por_local_fu_cod_plu_producto_idx on perfil_1_clr_resumen_producto_por_local_full res  (cost=0.42..33.82 rows=8 width=1254)
Index Cond: ((cod_plu_producto)::text = (prod.cod_plu)::text)
Filter: ((cod_plu_retail)::text = (prod.cod_plu_retail)::text)

给定此查询计划,我应该从哪里开始进行优化?我正在考虑为此添加索引:

(((opp.cod_local)::text = (inve.cod_local)::text) AND ((opp.cod_plu)::text = (inve.cod_plu)::text) AND ((opp.cod_plu_retail)::text = (inve.cod_plu_retail)::text))

任何提示都非常感谢。很多表都没有索引,这就是为什么我需要知道使用 EXPLAIN 并使用不同的比较器将索引应用于列是否是一个很好的起点。或者也许应该努力优化查询?

您的问题是您的子查询(别名为venta(引用表rpro_vistas_hd,这意味着它是一个相关的子查询,这反过来意味着它针对前面连接的每一行执行

使问题更加复杂的是,您正确地连接到外部表,而右侧连接无论如何都表现不佳。

重写查询以不使用相关的子查询。

最新更新