I need some help with a PostgreSQL query. I have the following SELECT query, which takes about 30 seconds to run on tables with roughly 100,000 and 200,000 rows.
SELECT i.id, i.debit_nr, i.pat_id, i.pat_name, i.invoice_id, i.invoice_date, i.due_date, i.client_short, i.payment, i.payment_option, i.marker, i.comment, sum(t.Sum) AS i_sum, i.import_date
FROM invoices AS i
LEFT JOIN invoice_items AS t ON t.invoice_id = i.id
JOIN jobs AS j ON i.job_id = j.id
GROUP BY i.id
The part that seems to be slow is just the SELECT on the invoices table, because if I run
SELECT i.id, i.debit_nr, i.pat_id, i.pat_name, i.invoice_id, i.invoice_date,
i.due_date, i.client_short, i.payment, i.payment_option, i.marker, i.comment, i.import_date
FROM invoices AS i
it takes almost the same amount of time.
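For reference, a plan like the one that follows is what you get when the first query is prefixed with EXPLAIN ANALYZE. This is only a sketch of how it was presumably produced, assuming it is run from psql or any other SQL client:

```sql
-- EXPLAIN ANALYZE actually executes the query and reports
-- the estimated costs next to the measured times and row counts.
EXPLAIN ANALYZE
SELECT i.id, i.debit_nr, i.pat_id, i.pat_name, i.invoice_id, i.invoice_date,
       i.due_date, i.client_short, i.payment, i.payment_option, i.marker,
       i.comment, sum(t.Sum) AS i_sum, i.import_date
FROM invoices AS i
LEFT JOIN invoice_items AS t ON t.invoice_id = i.id
JOIN jobs AS j ON i.job_id = j.id
GROUP BY i.id;
```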
GroupAggregate (cost=63048.71..65737.16 rows=110203 width=76) (actual time=1421.792..1785.528 rows=110203 loops=1)
  Group Key: i.id
  -> Sort (cost=63048.71..63577.52 rows=211523 width=76) (actual time=1421.772..1573.998 rows=211527 loops=1)
        Sort Key: i.id
        Sort Method: external merge Disk: 19944kB
        -> Hash Right Join (cost=24793.35..34938.02 rows=211523 width=76) (actual time=473.877..1010.362 rows=211527 loops=1)
              Hash Cond: (t.invoice_id = i.id)
              -> Seq Scan on invoice_items t (cost=0.00..3878.23 rows=211523 width=12) (actual time=0.035..112.034 rows=211523 loops=1)
              -> Hash (cost=22123.81..22123.81 rows=110203 width=72) (actual time=472.566..472.566 rows=110203 loops=1)
                    Buckets: 65536 Batches: 4 Memory Usage: 3592kB
                    -> Hash Join (cost=777.49..22123.81 rows=110203 width=72) (actual time=7.784..334.883 rows=110203 loops=1)
                          Hash Cond: (i.job_id = j.id)
                          -> Seq Scan on invoices i (cost=0.00..19831.03 rows=110203 width=76) (actual time=0.005..170.120 rows=110203 loops=1)
                          -> Hash (cost=705.55..705.55 rows=5755 width=8) (actual time=7.707..7.707 rows=5755 loops=1)
                                Buckets: 8192 Batches: 1 Memory Usage: 289kB
                                -> Seq Scan on jobs j (cost=0.00..705.55 rows=5755 width=8) (actual time=0.004..4.741 rows=5755 loops=1)
Planning time: 0.874 ms
Execution time: 1824.846 ms
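The problem is that it makes no difference whether I add an index on the id field alone or on all the fields needed in this select.
How can I speed this up?
P.S.: This is PostgreSQL 9.0 on a Windows server.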
Try writing the query using a correlated subquery:
SELECT i.*,
(SELECT SUM(it.Sum)
FROM invoice_items it
WHERE it.invoice_id = i.id
) as i_sum
FROM invoices i ;
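Avoiding the outer aggregation might help performance (although Postgres has a good optimizer, so this is not always true). You want an index on invoice_items(invoice_id, sum). I left jobs out of the query because it does not seem to be used.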
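The index suggestion could be tried roughly as follows. This is a minimal sketch, not a tested fix: the index name invoice_items_invoice_id_sum_idx is made up for illustration, and the column names are taken from the queries in the question.

```sql
-- Hypothetical index name; the column list follows the suggestion above.
CREATE INDEX invoice_items_invoice_id_sum_idx
    ON invoice_items (invoice_id, sum);

-- Refresh planner statistics so the new index is costed sensibly.
ANALYZE invoice_items;

-- Re-check the plan of the correlated-subquery version
-- to see whether the subquery now uses the index.
EXPLAIN ANALYZE
SELECT i.*,
       (SELECT SUM(it.sum)
        FROM invoice_items it
        WHERE it.invoice_id = i.id
       ) AS i_sum
FROM invoices i;
```

Putting sum in the index as a second column only pays off where index-only scans are available, which is PostgreSQL 9.2 and later. On an older server the index still helps the lookup by invoice_id, but the table rows have to be fetched anyway.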