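I am trying to resolve a performance problem with a query in PostgreSQL.
General concept being modelled: software licences that are purchased and then assigned. I think I have stripped out the other modelling details, so it is now very similar to a standard hotel room booking system, except that the current assignment (the hotel booking) normally has no known end date.
Purpose of the query: this is a view that assembles the information displayed about a licence together with the information from its purchase. When the application queries the view it supplies a tLicences.id, so that a single row is returned.
Concepts left in the query, relative to the hotel analogy:
- Some licence agreements restrict how quickly software can be reassigned; this has been hard-coded as 1 day.
- In theory, a licence can have a past assignment and a current assignment in effect at the same time. This should not happen, and the application guards against it, but if a human has made a mistake in the real world, the application does allow that mistake to be entered into the system. This obviously differs from a normal hotel system, where the current occupant would object if a guest walked into the wrong room.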
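The overlapping-assignment case described in the second bullet could be detected with a query along these lines. This is only a sketch against the test schema defined further down; it assumes that end_ is inclusive and that an open-ended assignment (end_ IS NULL) runs to 'infinity', neither of which the post states explicitly.

```sql
-- Sketch: pairs of non-deleted assignments on the same licence whose
-- date ranges overlap. Assumption: end_ is an inclusive end date, and
-- a NULL end_ means the assignment is still running.
SELECT
    a.licence_id,
    a.id AS earlier_assignment_id,
    b.id AS later_assignment_id
FROM test.tAssignments AS a
INNER JOIN test.tAssignments AS b
    ON b.licence_id = a.licence_id
    -- order each pair once: a starts first (ties broken by id)
    AND (a.start, a.id) < (b.start, b.id)
WHERE
    NOT a.deleted AND
    NOT b.deleted AND
    -- a has not ended by the time b starts
    COALESCE(a.end_, 'infinity'::date) >= b.start;
```

An empty result would mean no such data-entry mistakes are present for the rows checked.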
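The nested SELECT with the alias purchase_quantities_assignnments is a view in the database (inlined here for convenience). Ideally, I would like any fix for my performance problem to avoid folding a modified version of this view into the query. Ideally, the view can continue to exist as it is and be used unchanged in other queries.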
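Problem
If I query this view (the query below) using WHERE tLicences.id = 19, the result takes a long time to generate. In particular, it appears to generate the entire set for periodsOfAvailability_start (slow) and only then join; this conclusion is based on the GroupAggregate node in EXPLAIN ANALYZE returning 10 rows (the number of purchases). I feel that the query planner should be able to determine that tLicences.purchase_id can be used to significantly reduce how much of periodsOfAvailability_start needs to be generated.
However, if I query using WHERE tLicences.id = 19 AND tLicences.purchase_id = ? [where ? is the purchase ID of that licence], the query runs as expected and generates only the part of periodsOfAvailability_start with that purchase ID; this conclusion is based on the GroupAggregate node in EXPLAIN ANALYZE returning 1 row (the single purchase the licence belongs to).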
Query
SELECT *
FROM test.tPurchases AS tPurchases
INNER JOIN test.tLicences
    ON tLicences.purchase_id = tPurchases.id
LEFT JOIN (
    SELECT
        purchase_id,
        SUM(
            CASE
                assignment_newer_id IS NOT null
                WHEN true THEN 1
                WHEN false THEN 0
            END
        ) AS prchs_quantity_assigned,
        SUM(
            CASE
                assignment_newer_id IS null AND
                current_timestamp
                    BETWEEN licence_availability_start AND
                            licence_availability_end
                WHEN true THEN 1
                WHEN false THEN 0
            END
        ) AS prchs_quantity_notAssignedAndCanBeAssigned,
        SUM(
            CASE
                assignment_newer_id IS null AND
                current_timestamp < licence_availability_start
                WHEN true THEN 1
                WHEN false THEN 0
            END
        ) AS prchs_quantity_notAssignedAndCannotBeAssigned
    FROM (
        SELECT
            tPurchases.id AS purchase_id,
            tPurchases.date_ AS purchase_date,
            tLicences.id AS licence_id,
            GREATEST(
                tPurchases.date_,
                older.end_,
                older.start + '1 day'::interval
            ) AS licence_availability_start,
            CASE
                WHEN newer.id IS NULL THEN 'infinity'
                ELSE newer.start - '1 day'::interval
            END AS licence_availability_end,
            COALESCE(newer.start, 'infinity') AS licence_availability_uninstallBy,
            older.id AS assignment_older_id,
            older.start AS assignment_older_start,
            older.end_ AS assignment_older_end,
            newer.id AS assignment_newer_id,
            newer.start AS assignment_newer_start,
            newer.end_ AS assignment_newer_end
        FROM test.tLicences
        INNER JOIN test.tPurchases
            ON tPurchases.id = tLicences.purchase_id
        LEFT JOIN test.tAssignments AS older
            ON (
                NOT older.deleted AND
                older.licence_id = tLicences.id
            )
        LEFT JOIN test.tAssignments AS newer
            ON (
                NOT newer.deleted AND
                newer.id <> older.id AND
                newer.licence_id = older.licence_id
            )
        WHERE
            NOT tLicences.deleted
        UNION
        SELECT
            tPurchases.id AS purchase_id,
            tPurchases.date_ AS purchase_date,
            tLicences.id AS licence_id,
            tPurchases.date_ AS licence_availability_start,
            oldest.start - '1 day'::interval AS licence_availability_end,
            oldest.start AS licence_availability_uninstallBy,
            null AS assignment_older_id,
            null AS assignment_older_start,
            null AS assignment_older_end,
            oldest.id AS assignment_newer_id,
            oldest.start AS assignment_newer_start,
            oldest.end_ AS assignment_newer_end
        FROM test.tLicences
        INNER JOIN test.tPurchases
            ON tPurchases.id = tLicences.purchase_id
        INNER JOIN test.tAssignments AS oldest
            ON oldest.licence_id = tLicences.id
        WHERE
            NOT tLicences.deleted AND
            NOT oldest.deleted
    ) AS periodsOfAvailability_start
    WHERE
        (assignment_newer_id IS null OR assignment_newer_end IS null)
    GROUP BY purchase_id
) AS purchase_quantities_assignnments
    ON
        purchase_quantities_assignnments.purchase_id = tPurchases.id
WHERE
    tLicences.id = 19 /* [Unexpected behaviour] The full set for "purchase_quantities_assignnments" is generated */
    --tLicences.id = 19 AND tLicences.purchase_id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated */
    --tLicences.id = 19 AND tPurchases.id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated */
    --tLicences.purchase_id = ? /* [Expected behaviour] Only the single relevant row for "purchase_quantities_assignnments" appears to be generated. Note: This is a different query *result* than the others */
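Question: is there some way to resolve this without having to supply tLicences.purchase_id?
Database version: PostgreSQL 9.0
SQL to generate the schema, the tables, and the data that populates them:
This is somewhat long because I wanted a volume of data similar to the real thing. If run time is a problem, the number of licences (30000) and the number of assignments (100000) can be reduced.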
CREATE SCHEMA test;
CREATE TABLE test.tPurchases (
    id serial not null,
    date_ date not null,
    /* … */
    deleted boolean not null DEFAULT false,
    PRIMARY KEY (id)
);
CREATE TABLE test.tLicences (
    id serial not null,
    purchase_id integer not null,
    /* … */
    deleted boolean not null DEFAULT false,
    PRIMARY KEY (id),
    FOREIGN KEY (purchase_id)
        REFERENCES test.tPurchases (id)
        ON UPDATE RESTRICT
        ON DELETE RESTRICT
);
CREATE INDEX ON test.tLicences(purchase_id);
CREATE TABLE test.tAssignments (
    id serial not null,
    licence_id integer not null,
    start date not null,
    end_ date,
    /* … */
    deleted boolean not null DEFAULT false,
    PRIMARY KEY (id),
    FOREIGN KEY (licence_id)
        REFERENCES test.tLicences (id)
        ON UPDATE RESTRICT
        ON DELETE RESTRICT,
    CHECK (start <= end_)
);
CREATE INDEX ON test.tAssignments(licence_id);
INSERT INTO test.tPurchases(id, date_)
SELECT
    id,
    '2000-01-01'::timestamp + random() * '1 year'::interval AS date_
FROM generate_series(1, 10) AS id
;
INSERT INTO test.tLicences(id, purchase_id, deleted)
SELECT
    id,
    trunc(random() * 10 + 1) AS purchase_id,
    (random() > 0.99) AS deleted
FROM generate_series(1, 30000) AS id
;
INSERT INTO test.tAssignments(id, licence_id, start, end_, deleted)
SELECT
    assignments.id,
    assignments.licence_id,
    tPurchases.date_ + ((rank * 20 + random() * 10) || ' days')::interval AS start,
    CASE
        assignments.rank = max(assignments.rank) OVER (PARTITION BY assignments.licence_id) AND
        random() > 0.5
        WHEN true THEN null
        ELSE tPurchases.date_ + ((rank * 20 + 10 + random() * 10) || ' days')::interval
    END AS end_,
    (random() > 0.95) AS deleted
FROM (
    SELECT
        assignments.id,
        assignments.licence_id,
        rank() OVER (PARTITION BY assignments.licence_id ORDER BY assignments.id) AS rank
    FROM (
        SELECT
            id,
            trunc(random() * 30000 + 1) AS licence_id
        FROM generate_series(1, 100000) AS id
    ) AS assignments
) AS assignments
INNER JOIN test.tLicences
    ON tLicences.id = assignments.licence_id
INNER JOIN test.tPurchases
    ON tPurchases.id = tLicences.purchase_id
;
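After a bulk load like this, the planner's estimates can be stale until statistics are collected, so it may be worth refreshing them before comparing plans. The following is a small sketch that does this and confirms the volumes; the expected counts in the comment assume the generator above was run unmodified.

```sql
-- Refresh planner statistics for the freshly loaded tables.
ANALYZE test.tPurchases;
ANALYZE test.tLicences;
ANALYZE test.tAssignments;

-- Sanity check on the generated volumes
-- (10, 30000 and 100000 with the unmodified generator).
SELECT
    (SELECT count(*) FROM test.tPurchases)   AS purchases,
    (SELECT count(*) FROM test.tLicences)    AS licences,
    (SELECT count(*) FROM test.tAssignments) AS assignments;
```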
You might need to refresh the table statistics (ANALYZE), but in general you can force this optimisation with a CTE, because PostgreSQL 9.0 plans each CTE separately from the outer query. Here I have also moved all of your sub-queries into CTEs to make the structure clearer:
WITH myPurchases AS
(
    -- Only the purchase that licence 19 belongs to
    SELECT tPurchases.*
    FROM test.tPurchases AS tPurchases
    INNER JOIN test.tLicences ON tLicences.purchase_id = tPurchases.id
    WHERE tLicences.id = 19
), periodsOfAvailability_start AS
(
    SELECT
        tPurchases.id AS purchase_id,
        tPurchases.date_ AS purchase_date,
        tLicences.id AS licence_id,
        GREATEST(tPurchases.date_, older.end_, older.start + '1 day'::interval) AS licence_availability_start,
        CASE WHEN newer.id IS NULL THEN 'infinity' ELSE newer.start - '1 day'::interval END AS licence_availability_end,
        COALESCE(newer.start, 'infinity') AS licence_availability_uninstallBy,
        older.id AS assignment_older_id,
        older.start AS assignment_older_start,
        older.end_ AS assignment_older_end,
        newer.id AS assignment_newer_id,
        newer.start AS assignment_newer_start,
        newer.end_ AS assignment_newer_end
    FROM test.tLicences
    INNER JOIN myPurchases AS tPurchases ON tPurchases.id = tLicences.purchase_id
    LEFT JOIN test.tAssignments AS older ON (NOT older.deleted AND older.licence_id = tLicences.id)
    LEFT JOIN test.tAssignments AS newer ON (NOT newer.deleted AND newer.id <> older.id AND newer.licence_id = older.licence_id)
    WHERE NOT tLicences.deleted
    UNION
    SELECT
        tPurchases.id AS purchase_id,
        tPurchases.date_ AS purchase_date,
        tLicences.id AS licence_id,
        tPurchases.date_ AS licence_availability_start,
        oldest.start - '1 day'::interval AS licence_availability_end,
        oldest.start AS licence_availability_uninstallBy,
        null AS assignment_older_id,
        null AS assignment_older_start,
        null AS assignment_older_end,
        oldest.id AS assignment_newer_id,
        oldest.start AS assignment_newer_start,
        oldest.end_ AS assignment_newer_end
    FROM test.tLicences
    INNER JOIN myPurchases AS tPurchases ON tPurchases.id = tLicences.purchase_id
    INNER JOIN test.tAssignments AS oldest ON oldest.licence_id = tLicences.id
    WHERE NOT tLicences.deleted AND NOT oldest.deleted
), purchase_quantities_assignnments AS
(
    SELECT
        purchase_id,
        SUM(CASE WHEN assignment_newer_id IS NOT null THEN 1 ELSE 0 END) AS prchs_quantity_assigned,
        SUM(CASE WHEN assignment_newer_id IS null AND current_timestamp BETWEEN licence_availability_start AND licence_availability_end THEN 1 ELSE 0 END) AS prchs_quantity_notAssignedAndCanBeAssigned,
        SUM(CASE WHEN assignment_newer_id IS null AND current_timestamp < licence_availability_start THEN 1 ELSE 0 END) AS prchs_quantity_notAssignedAndCannotBeAssigned
    FROM periodsOfAvailability_start
    WHERE assignment_newer_id IS null OR assignment_newer_end IS null
    GROUP BY purchase_id
)
SELECT *
FROM myPurchases AS tPurchases
INNER JOIN test.tLicences ON tLicences.purchase_id = tPurchases.id
LEFT JOIN purchase_quantities_assignnments ON purchase_quantities_assignnments.purchase_id = tPurchases.id
-- Keep only the requested licence; the purchase has many other licences
WHERE tLicences.id = 19
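If the view has to stay untouched, as the question prefers, one variation that may be worth testing is to let the query look up the purchase ID itself in a scalar sub-query, so that the application still supplies only the licence ID. This is a sketch, not a verified fix: it assumes the nested SELECT exists as a view named test.purchase_quantities_assignnments (the schema is a guess), and whether the planner pushes the resulting condition down into the view should be confirmed with EXPLAIN ANALYZE on the version in use.

```sql
-- Assumption: the view lives in the test schema under this name.
EXPLAIN ANALYZE
SELECT *
FROM test.tPurchases AS tPurchases
INNER JOIN test.tLicences
    ON tLicences.purchase_id = tPurchases.id
LEFT JOIN test.purchase_quantities_assignnments AS pqa
    ON pqa.purchase_id = tPurchases.id
WHERE
    tLicences.id = 19 AND
    -- Redundant predicate: lets the planner see a fixed purchase ID
    -- without the application having to provide it.
    tPurchases.id = (
        SELECT purchase_id
        FROM test.tLicences
        WHERE id = 19
    );
```

If the GroupAggregate node in the resulting plan reports 1 row rather than 10, the restriction reached the view's aggregate, matching the behaviour the question describes for an explicitly supplied purchase ID.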