How to keep only the rows with the maximum value from a queried table in PostgreSQL



I'm new to PostgreSQL, and my problem can be reduced to the following. Suppose I have two tables:

Table A:

id | join_value | filter_data1 | filter_data2
---------------------------------------------
1  |    1       |      "Yes"   |     1   
2  |    1       |      "Yes"   |     3
3  |    2       |      "No"    |     0

Table B:

id | join_value | filter_data1 | filter_data2 |   date
---------------------------------------------------------
1  |    3       |      "Yes"   |     0        |  1/3/2021
2  |    1       |      "Yes"   |     17       |  1/3/2021
3  |    1       |      "No"    |     -1       |  1/2/2021
4  |    1       |      "Yes"   |     32       |  1/2/2021
5  |    1       |      "Yes"   |     40       |  1/3/2021

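I want to filter these tables on the filter columns and then join them on join_value. The catch is that I only want the rows where date == MAX(date). Here is an example of the query I tried.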

SELECT * FROM
(SELECT * FROM A
WHERE filter_data1 = 'Yes'
AND filter_data2 > 2)
AS a_tab
JOIN
(SELECT * FROM B
WHERE filter_data1 = 'Yes'
AND filter_data2 > 16)
AS b_tab
ON a_tab.join_value = b_tab.join_value;

This gives me the following table:

id | join_value | filter_data1 | filter_data2 | id | filter_data1 | filter_data2 |   date
------------------------------------------------------------------------------------------
2  |   1        |    "Yes"     |     3        | 2  |    "Yes"     |     17       | 1/3/2021
2  |   1        |    "Yes"     |     3        | 4  |    "Yes"     |     32       | 1/2/2021
2  |   1        |    "Yes"     |     3        | 5  |    "Yes"     |     40       | 1/3/2021

The problem is that I also want to apply something like "WHERE date = MAX(date)".

The resulting table should look like this:

id | join_value | filter_data1 | filter_data2 | id | filter_data1 | filter_data2 |   date
------------------------------------------------------------------------------------------
2  |   1        |    "Yes"     |     3        | 2  |    "Yes"     |     17       | 1/3/2021
2  |   1        |    "Yes"     |     3        | 5  |    "Yes"     |     40       | 1/3/2021

Does anyone know how to do this?

First, a tip on how to write your existing select query in a more readable way:

SELECT
a.*, b.*
FROM a
INNER JOIN b ON b.join_value = a.join_value
WHERE a.filter_data1 = 'Yes' AND a.filter_data2 > 2
AND b.filter_data1 = 'Yes' AND b.filter_data2 > 16

Now I'll add another column to this query that holds the maximum value of the date column across the output rows. For that we can use a window function:

SELECT
a.*, b.*, MAX(b.date) OVER ()
FROM a
INNER JOIN b ON b.join_value = a.join_value
WHERE a.filter_data1 = 'Yes' AND a.filter_data2 > 2
AND b.filter_data1 = 'Yes' AND b.filter_data2 > 16

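Window functions are evaluated late in the query, after the WHERE clause has already been applied, so we cannot add a condition on the window function's result at this level. Instead, we use this query as a subquery and add the condition to the outer query: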

SELECT
*
FROM (
SELECT
a.*, b.*, MAX(b.date) OVER () AS max_date
FROM a
INNER JOIN b ON b.join_value = a.join_value
WHERE a.filter_data1 = 'Yes' AND a.filter_data2 > 2
AND b.filter_data1 = 'Yes' AND b.filter_data2 > 16
) t
WHERE t.date = t.max_date

This gives you the desired result.
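
To try the approach end to end, here is a minimal sketch that rebuilds the sample tables from the question and runs the subquery version. The column types and the reading of the dates as month/day/year are assumptions the question does not state, and the explicit aliases (a_id, b_id, a_filter_data2, b_filter_data2) are hypothetical names added only so the output columns are unambiguous.

```sql
-- Sample tables from the question; column types are assumed.
CREATE TABLE a (
    id           integer PRIMARY KEY,
    join_value   integer,
    filter_data1 text,
    filter_data2 integer
);

CREATE TABLE b (
    id           integer PRIMARY KEY,
    join_value   integer,
    filter_data1 text,
    filter_data2 integer,
    date         date
);

INSERT INTO a VALUES
    (1, 1, 'Yes', 1),
    (2, 1, 'Yes', 3),
    (3, 2, 'No',  0);

-- Dates are assumed to be month/day/year in the question.
INSERT INTO b VALUES
    (1, 3, 'Yes',  0, DATE '2021-01-03'),
    (2, 1, 'Yes', 17, DATE '2021-01-03'),
    (3, 1, 'No',  -1, DATE '2021-01-02'),
    (4, 1, 'Yes', 32, DATE '2021-01-02'),
    (5, 1, 'Yes', 40, DATE '2021-01-03');

-- Inner query: filter, join, and attach the overall maximum date.
-- Outer query: keep only the rows whose date equals that maximum.
SELECT *
FROM (
    SELECT
        a.id           AS a_id,
        a.join_value,
        a.filter_data2 AS a_filter_data2,
        b.id           AS b_id,
        b.filter_data2 AS b_filter_data2,
        b.date,
        MAX(b.date) OVER () AS max_date
    FROM a
    INNER JOIN b ON b.join_value = a.join_value
    WHERE a.filter_data1 = 'Yes' AND a.filter_data2 > 2
      AND b.filter_data1 = 'Yes' AND b.filter_data2 > 16
) t
WHERE t.date = t.max_date
ORDER BY t.b_id;
-- Returns the rows with b_id 2 and 5, both dated 2021-01-03.
```

Note that MAX(b.date) OVER () takes the maximum over all joined rows. If you instead need the latest date per join_value, OVER (PARTITION BY b.join_value) would be the variant to try.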
