PostgreSQL query: update the status of a limited number of records based on group size



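I have a PostgreSQL table containing a list of email addresses. The table has three columns: Email, EmailServer (e.g. gmail.com, outlook.com, msn.com, yahoo.com.ca), and Valid (boolean).

Now I want to group these emails by EmailServer and then update the first 3 records of each large group (count >= 6) to Valid = true, leaving the remaining records of each group as Valid = false.

The query below does not give me the desired output: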

UPDATE public."EmailContacts"   
SET "Valid"=true
WHERE "EmailServer" IN (
SELECT "EmailServer"
FROM public."EmailContacts"
GROUP by "EmailServer"
HAVING count(*) >=6
LIMIT 5)

Please help me modify it so that I get the expected result. Any help would be greatly appreciated!

WITH major_servers AS (
    SELECT email_server
    FROM email_address
    GROUP BY email_server
    HAVING count(*) >= 6
),
enumerated_emails AS (
    SELECT email,
           email_server,
           row_number() OVER (PARTITION BY email_server ORDER BY email) AS row_number -- TODO: ORDER BY email - attention
    FROM email_address
    WHERE email_server IN (SELECT email_server FROM major_servers)
)
UPDATE email_address
SET valid = true
WHERE email IN (SELECT email
                FROM enumerated_emails ee
                WHERE ee.row_number <= 3);
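  • The first query, major_servers, finds the major groups: email servers that have more than 5 addresses (count >= 6).
  • The second query, enumerated_emails, uses the window function row_number() to number the emails that belong to the major groups in their natural order (see the TODO comment; I think you should pick a different ORDER BY criterion).
  • The final query updates the first 3 rows of each major server group.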
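
The steps above can also be collapsed into a single pass with two window functions. This is only a sketch, not a tested drop-in: it assumes the same email_address table (email, email_server, valid) and assumes that email uniquely identifies a row, since it is used to match rows back to the table. The names ranked, group_size and rn are made up for illustration.

```sql
-- Sketch only: assumes email is unique in email_address.
WITH ranked AS (
    SELECT email,
           -- size of the server group this row belongs to
           count(*)     OVER (PARTITION BY email_server) AS group_size,
           -- position of the row inside its server group
           row_number() OVER (PARTITION BY email_server ORDER BY email) AS rn
    FROM email_address
)
UPDATE email_address AS ea
SET valid = (r.group_size >= 6 AND r.rn <= 3)
FROM ranked AS r
WHERE ea.email = r.email;
```

Unlike the query above, this variant writes every row: the first 3 rows of each group with 6 or more addresses get valid = true, and every other row is explicitly set to false. That matches the "rest stay false" requirement, but it also overwrites any rows that were already true.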

Find the sql-fiddle here.

You need to get the servers, then number the emails within each server, and then perform the update. Something like this:

WITH DataSourceServers AS
(
    -- servers with at least 6 addresses
    SELECT "EmailServer"
    FROM public."EmailContacts"
    GROUP BY "EmailServer"
    HAVING count(*) >= 6
), DataSourceEmails AS
(
    -- "EmailServer" must be selected here because the UPDATE joins on it
    SELECT "Email", "EmailServer",
           row_number() OVER (PARTITION BY "EmailServer" ORDER BY "Email") AS rn
    FROM public."EmailContacts"
    WHERE "EmailServer" IN (SELECT "EmailServer" FROM DataSourceServers)
)
-- In PostgreSQL the target table is aliased directly and must not be repeated in FROM
UPDATE public."EmailContacts" AS E
SET "Valid" = true
FROM DataSourceEmails AS SE
WHERE E."EmailServer" = SE."EmailServer"
  AND E."Email" = SE."Email"
  AND SE.rn <= 3;
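
To check the result after running an update like this, a per-server summary can help. The sketch below uses the same "EmailContacts" table; the aliases total and valid_rows are made up for illustration.

```sql
-- Per-server check: group size and how many rows are now marked valid.
SELECT "EmailServer",
       count(*)                        AS total,
       count(*) FILTER (WHERE "Valid") AS valid_rows
FROM public."EmailContacts"
GROUP BY "EmailServer"
ORDER BY total DESC;
```

Assuming "Valid" was false everywhere before the update and each "Email" appears only once per server, servers with total >= 6 should show valid_rows = 3 and all other servers should show 0.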
