I have a query that runs fast during off-peak hours but becomes slow under load; in New Relic it sometimes takes 5-8 minutes. The query itself looks simple, but the view definition behind it is not, so I would like to know whether there is any scope for optimization. Database version: "PostgreSQL 10.14 on x86_64-pc-linux-gnu, compiled by x86_64-unknown-linux-gnu-gcc (GCC) 4.9.4, 64-bit"
The query that shows up in the monitoring tools is:
SELECT
esnpartvie0_.esn_id AS col_0_0_,
esnpartvie0_.esn AS col_1_0_,
esnpartvie0_.quarter_point AS col_2_0_,
esnpartvie0_.work_order_number AS col_3_0_,
esnpartvie0_.site AS col_4_0_,
sum(esnpartvie0_.critical) AS col_5_0_,
sum(esnpartvie0_.numshort) AS col_6_0_,
sum(esnpartvie0_.wa) AS col_7_0_,
esnpartvie0_.customer AS col_8_0_,
esnpartvie0_.adj_accum_date AS col_9_0_,
esnpartvie0_.g2_otr AS col_10_0_,
esnpartvie0_.induct_date AS col_11_0_,
min(esnpartvie0_.delta) AS col_12_0_,
esnpartvie0_.fiscal_week_bucket_date AS col_13_0_
FROM
moa.esn_part_view esnpartvie0_
WHERE
esnpartvie0_.esn_id = 140339
GROUP BY
esnpartvie0_.esn_id,
esnpartvie0_.esn,
esnpartvie0_.quarter_point,
esnpartvie0_.work_order_number,
esnpartvie0_.site,
esnpartvie0_.customer,
esnpartvie0_.adj_accum_date,
esnpartvie0_.g2_otr,
esnpartvie0_.induct_date,
esnpartvie0_.fiscal_week_bucket_date
EXPLAIN (ANALYZE, BUFFERS) output for this query (also at https://explain.depesz.com/s/mr76#html):
"GroupAggregate (cost=69684.12..69684.17 rows=1 width=82) (actual time=976.163..976.228 rows=1 loops=1)"
" Group Key: esnpartvie0_.esn_id, esnpartvie0_.esn, esnpartvie0_.quarter_point, esnpartvie0_.work_order_number, esnpartvie0_.site, esnpartvie0_.customer, esnpartvie0_.adj_accum_date, esnpartvie0_.g2_otr, esnpartvie0_.induct_date, esnpartvie0_.fiscal_week_bucket_date"
" Buffers: shared hit=20301, temp read=48936 written=6835"
" -> Sort (cost=69684.12..69684.13 rows=1 width=70) (actual time=976.153..976.219 rows=14 loops=1)"
" Sort Key: esnpartvie0_.esn, esnpartvie0_.quarter_point, esnpartvie0_.work_order_number, esnpartvie0_.site, esnpartvie0_.customer, esnpartvie0_.adj_accum_date, esnpartvie0_.g2_otr, esnpartvie0_.induct_date, esnpartvie0_.fiscal_week_bucket_date"
" Sort Method: quicksort Memory: 26kB"
" Buffers: shared hit=20301, temp read=48936 written=6835"
" -> Subquery Scan on esnpartvie0_ (cost=69684.02..69684.11 rows=1 width=70) (actual time=976.078..976.158 rows=14 loops=1)"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" -> GroupAggregate (cost=69684.02..69684.10 rows=1 width=2016) (actual time=976.077..976.155 rows=14 loops=1)"
" Group Key: e.esn_id, w.number, ed.adj_accum_date, (COALESCE(ed.gate_2_otr, 0)), ed.gate_0_start, ed.gate_1_stop, p.part_id, st.name, mat.name, so.name, dr.name, hpc.hpc_status_name, module.module_name, c.customer_id, m.model_id, ef.engine_family_id, s.site_id, ws.name, ic.comment"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" CTE indexed_comments"
" -> WindowAgg (cost=40573.82..45076.80 rows=225149 width=118) (actual time=182.537..291.895 rows=216974 loops=1)"
" Buffers: shared hit=5226, temp read=3319 written=3327"
" -> Sort (cost=40573.82..41136.69 rows=225149 width=110) (actual time=182.528..215.549 rows=216974 loops=1)"
" Sort Key: part_comment.part_id, part_comment.created_at DESC"
" Sort Method: external merge Disk: 26552kB"
" Buffers: shared hit=5226, temp read=3319 written=3327"
" -> Seq Scan on part_comment (cost=0.00..7474.49 rows=225149 width=110) (actual time=0.014..38.209 rows=216974 loops=1)"
" Buffers: shared hit=5223"
" -> Sort (cost=24607.21..24607.22 rows=1 width=717) (actual time=976.069..976.133 rows=14 loops=1)"
" Sort Key: w.number, ed.adj_accum_date, (COALESCE(ed.gate_2_otr, 0)), ed.gate_0_start, ed.gate_1_stop, p.part_id, st.name, mat.name, so.name, dr.name, hpc.hpc_status_name, module.module_name, c.customer_id, m.model_id, ef.engine_family_id, s.site_id, ws.name, ic.comment"
" Sort Method: quicksort Memory: 28kB"
" Buffers: shared hit=20290, temp read=48936 written=6835"
" -> Nested Loop (cost=1010.23..24607.20 rows=1 width=717) (actual time=442.381..976.017 rows=14 loops=1)"
" Buffers: shared hit=20287, temp read=48936 written=6835"
" -> Nested Loop Left Join (cost=1009.94..24598.88 rows=1 width=697) (actual time=442.337..975.670 rows=14 loops=1)"
" Join Filter: (ic.part_id = p.part_id)"
" Rows Removed by Join Filter: 824838"
" Buffers: shared hit=20245, temp read=48936 written=6835"
" -> Nested Loop Left Join (cost=1009.94..19518.95 rows=1 width=181) (actual time=56.148..57.676 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.81..19518.35 rows=1 width=183) (actual time=56.139..57.635 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.67..19517.67 rows=1 width=181) (actual time=56.133..57.598 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.55..19516.82 rows=1 width=179) (actual time=56.124..57.544 rows=14 loops=1)"
" Buffers: shared hit=15019"
" -> Nested Loop Left Join (cost=1009.42..19516.04 rows=1 width=178) (actual time=56.105..57.439 rows=14 loops=1)"
" Buffers: shared hit=14991"
" -> Nested Loop Left Join (cost=1009.28..19515.37 rows=1 width=175) (actual time=56.089..57.335 rows=14 loops=1)"
" Buffers: shared hit=14963"
" -> Nested Loop Left Join (cost=1009.14..19514.77 rows=1 width=170) (actual time=56.068..57.206 rows=14 loops=1)"
" Join Filter: (e.work_scope_id = ws.work_scope_id)"
" Buffers: shared hit=14935"
" -> Nested Loop Left Join (cost=1009.14..19513.55 rows=1 width=166) (actual time=56.043..57.102 rows=14 loops=1)"
" Join Filter: (e.esn_id = p.esn_id)"
" Buffers: shared hit=14921"
" -> Nested Loop (cost=9.14..31.40 rows=1 width=125) (actual time=0.081..0.130 rows=1 loops=1)"
" Buffers: shared hit=15"
" -> Nested Loop (cost=8.87..23.08 rows=1 width=118) (actual time=0.069..0.117 rows=1 loops=1)"
" Buffers: shared hit=12"
" -> Nested Loop (cost=8.73..21.86 rows=1 width=108) (actual time=0.055..0.102 rows=1 loops=1)"
" Buffers: shared hit=10"
" -> Nested Loop (cost=8.60..21.65 rows=1 width=46) (actual time=0.046..0.091 rows=1 loops=1)"
" Buffers: shared hit=8"
" -> Hash Join (cost=8.31..13.34 rows=1 width=41) (actual time=0.036..0.081 rows=1 loops=1)"
" Hash Cond: (m.model_id = e.model_id)"
" Buffers: shared hit=5"
" -> Seq Scan on model m (cost=0.00..4.39 rows=239 width=17) (actual time=0.010..0.038 rows=240 loops=1)"
" Buffers: shared hit=2"
" -> Hash (cost=8.30..8.30 rows=1 width=28) (actual time=0.009..0.010 rows=1 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 9kB"
" Buffers: shared hit=3"
" -> Index Scan using esn_pkey on esn e (cost=0.29..8.30 rows=1 width=28) (actual time=0.006..0.006 rows=1 loops=1)"
" Index Cond: (esn_id = 140339)"
" Filter: active"
" Buffers: shared hit=3"
" -> Index Scan using work_order_pkey on work_order w (cost=0.29..8.30 rows=1 width=13) (actual time=0.008..0.008 rows=1 loops=1)"
" Index Cond: (work_order_id = e.work_order_id)"
" Buffers: shared hit=3"
" -> Index Scan using engine_family_pkey on engine_family ef (cost=0.14..0.20 rows=1 width=66) (actual time=0.009..0.009 rows=1 loops=1)"
" Index Cond: (engine_family_id = m.engine_family_id)"
" Buffers: shared hit=2"
" -> Index Scan using site_pkey on site s (cost=0.14..1.15 rows=1 width=14) (actual time=0.013..0.013 rows=1 loops=1)"
" Index Cond: (site_id = ef.site_id)"
" Buffers: shared hit=2"
" -> Index Scan using customer_pkey on customer c (cost=0.27..8.29 rows=1 width=11) (actual time=0.012..0.012 rows=1 loops=1)"
" Index Cond: (customer_id = e.customer_id)"
" Buffers: shared hit=3"
" -> Gather (cost=1000.00..19481.78 rows=29 width=41) (actual time=55.958..56.949 rows=14 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=14906"
" -> Parallel Seq Scan on part p (cost=0.00..18478.88 rows=12 width=41) (actual time=51.855..52.544 rows=5 loops=3)"
" Filter: (active AND (esn_id = 140339))"
" Rows Removed by Filter: 226662"
" Buffers: shared hit=14906"
" -> Seq Scan on work_scope ws (cost=0.00..1.10 rows=10 width=12) (actual time=0.004..0.004 rows=1 loops=14)"
" Buffers: shared hit=14"
" -> Index Scan using source_pkey on source so (cost=0.14..0.57 rows=1 width=13) (actual time=0.005..0.005 rows=1 loops=14)"
" Index Cond: (p.source_id = source_id)"
" Buffers: shared hit=28"
" -> Index Scan using status_pkey on status st (cost=0.13..0.56 rows=1 width=11) (actual time=0.004..0.004 rows=1 loops=14)"
" Index Cond: (p.status_id = status_id)"
" Buffers: shared hit=28"
" -> Index Scan using material_stream_pkey on material_stream mat (cost=0.13..0.56 rows=1 width=9) (actual time=0.004..0.004 rows=1 loops=14)"
" Index Cond: (p.material_stream_id = material_stream_id)"
" Buffers: shared hit=28"
" -> Index Scan using dr_status_pkey on dr_status dr (cost=0.13..0.56 rows=1 width=10) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.dr_status_id = dr_status_id)"
" -> Index Scan using hpc_status_pkey on hpc_status hpc (cost=0.13..0.56 rows=1 width=10) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.hpc_status_id = hpc_status_id)"
" -> Index Scan using module_pkey on module (cost=0.14..0.57 rows=1 width=6) (actual time=0.001..0.001 rows=0 loops=14)"
" Index Cond: (p.module_id = module_id)"
" -> CTE Scan on indexed_comments ic (cost=0.00..5065.85 rows=1126 width=520) (actual time=13.043..61.251 rows=58917 loops=14)"
" Filter: (comment_index = 1)"
" Rows Removed by Filter: 158057"
" Buffers: shared hit=5226, temp read=48936 written=6835"
" -> Index Scan using esn_dates_esn_id_key on esn_dates ed (cost=0.29..8.32 rows=1 width=20) (actual time=0.019..0.020 rows=1 loops=14)"
" Index Cond: (esn_id = 140339)"
" Filter: ((gate_3_stop_actual AND (gate_3_stop >= now())) OR (gate_3_stop IS NULL) OR ((NOT gate_3_stop_actual) AND (gate_3_stop IS NOT NULL) AND (gate_3_stop >= (now() - '730 days'::interval))))"
" Buffers: shared hit=42"
"Planning time: 6.564 ms"
"Execution time: 988.335 ms"
The actual view definition that the SELECT above runs against:
with indexed_comments as (
select
part_comment.part_id,
part_comment.comment,
row_number() over (partition by part_comment.part_id
order by
part_comment.created_at desc) as comment_index
from
moa.part_comment
)
select
e.esn_id,
e.name as esn,
e.is_qp_engine as quarter_point,
w.number as work_order_number,
case
when (p.part_id is null) then 0
else p.part_id
end as part_id,
p.part_number,
p.part_description,
p.quantity,
st.name as status,
p.status_id,
mat.name as material_stream,
p.material_stream_id,
so.name as source,
p.source_id,
p.oem,
p.po_number,
p.manual_cso_commit,
p.auto_cso_commit,
coalesce(p.manual_cso_commit, p.auto_cso_commit) as calculated_cso_commit,
(coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start) + p.accum_offset) as adjusted_accum,
dr.name as dr_status,
p.dr_status_id,
p.airway_bill,
p.core_material,
hpc.hpc_status_name as hpc_status,
p.hpc_status_id,
module.module_name,
p.module_id,
c.name as customer,
c.customer_id,
m.name as model,
m.model_id,
ef.name as engine_family,
ef.engine_family_id,
s.label as site,
s.site_id,
case
when (coalesce(p.manual_cso_commit, p.auto_cso_commit) > coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start)) then 1
else 0
end as critical,
case
when (coalesce(p.manual_cso_commit, p.auto_cso_commit) <= coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start)) then 1
else 0
end as numshort,
case
when ((p.esn_id is not null)
and (coalesce(p.manual_cso_commit, p.auto_cso_commit) is null)) then 1
else 0
end as wa,
ed.adj_accum_date,
(ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)) as g2_otr,
ed.gate_0_start as induct_date,
coalesce((coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0))) - max(coalesce(p.manual_cso_commit, p.auto_cso_commit))), 0) as delta,
coalesce(ed.adj_accum_date, (ed.gate_1_stop + coalesce(ed.gate_2_otr, 0)), ed.gate_0_start) as fiscal_week_bucket_date,
p.po_line_num,
p.ship_out,
p.receipt,
p.crit_ship,
e.work_scope_id,
ws.name as work_scope,
p.late_call,
p.ex_esn,
p.accum_offset,
ic.comment as latest_comment
from
(((((((((((((((moa.esn e
join moa.work_order w
using (work_order_id))
join moa.model m
using (model_id))
join moa.engine_family ef on
((m.engine_family_id = ef.engine_family_id)))
join moa.site s on
((ef.site_id = s.site_id)))
join moa.customer c
using (customer_id))
left join moa.part p on
(((e.esn_id = p.esn_id)
and (p.active <> false))))
left join moa.work_scope ws on
((e.work_scope_id = ws.work_scope_id)))
left join moa.source so on
((p.source_id = so.source_id)))
left join moa.status st on
((p.status_id = st.status_id)))
left join moa.material_stream mat
using (material_stream_id))
left join moa.dr_status dr
using (dr_status_id))
left join moa.hpc_status hpc
using (hpc_status_id))
left join moa.module module
using (module_id))
left join indexed_comments ic on
(((ic.part_id = p.part_id)
and (ic.comment_index = 1))))
join moa.esn_dates ed on
((e.esn_id = ed.esn_id)))
where
((e.active = true)
and (((ed.gate_3_stop_actual = true)
and (ed.gate_3_stop >= now()))
or (ed.gate_3_stop is null)
or ((ed.gate_3_stop_actual = false)
and (ed.gate_3_stop is not null)
and (ed.gate_3_stop >= (now() - '730 days'::interval)))))
group by
e.esn_id,
w.number,
s.label,
c.name,
p.active,
ed.adj_accum_date,
coalesce(ed.gate_2_otr, 0),
ed.gate_0_start,
ed.gate_1_stop,
p.part_id,
st.name,
mat.name,
so.name,
dr.name,
hpc.hpc_status_name,
module.module_name,
c.customer_id,
m.name,
m.model_id,
ef.name,
ef.engine_family_id,
s.site_id,
ws.name,
ic.comment;
That's quite a beast of a query.
Most of the time goes here:
-> CTE Scan on indexed_comments ic (cost=0.00..5065.85 rows=1126 width=520) (actual time=13.043..61.251 rows=58917 loops=14)
and the culprit is the misestimate on its upper sibling node. The planner thinks it will need to scan the CTE only once, but it actually has to do it 14 times (apparently getting the same answer each time); roughly 61 ms × 14 loops accounts for about 860 ms of the ~976 ms total. If it knew the scan would be repeated, it would build a hash table rather than walking the CTE every time; but since building the hash costs one pass over it, there is nothing to be saved if the planner believes only one pass is needed in the first place.
I don't know how to fix the estimation problem. But instead of pre-computing the ranking for every part and then having to search through all of them, you can compute the ranking on the fly. A LATERAL join lets you do that.
Change

left join indexed_comments ic on
    (ic.part_id = p.part_id
    and ic.comment_index = 1)

to

left join lateral (
    select pc.comment
    from moa.part_comment pc
    where pc.part_id = p.part_id
    order by pc.created_at desc
    limit 1
) ic on true
and drop the with indexed_comments as (...) CTE entirely.
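As a standalone sketch of that pattern (assuming only the moa.part and moa.part_comment columns already referenced in the view, and reusing the esn_id literal from the question), the latest comment per part can be fetched like this:

-- fetch each active part for one esn together with its most recent comment
select
    p.part_id,
    ic.comment as latest_comment
from moa.part p
left join lateral (
    -- runs once per part row, returning at most one comment
    select pc.comment
    from moa.part_comment pc
    where pc.part_id = p.part_id
    order by pc.created_at desc
    limit 1
) ic on true
where p.esn_id = 140339
  and p.active;

Inside the view, the select list's ic.comment as latest_comment and the ic.comment entry in the GROUP BY keep working unchanged, since the lateral subquery still exposes a column named comment.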
For this to be fast, you will want an index ON part_comment (part_id, created_at).
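A sketch of that index (the index name is arbitrary; CONCURRENTLY is optional and only avoids blocking writes while it builds; a plain ascending btree on (part_id, created_at) is enough, because it can be scanned backwards to satisfy ORDER BY created_at DESC LIMIT 1):

-- note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block
create index concurrently if not exists part_comment_part_id_created_at_idx
    on moa.part_comment (part_id, created_at);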