Postgresql查询:根据组大小更新限制记录数的状态

kpbwa7wx  于 2023-01-13  发布在  PostgreSQL
关注(0)|答案(2)|浏览(127)

我有一个postgresql表,包含一个电子邮件地址列表,该表有三列,Email,EmailServer(例如,gmail.com,outlook.commsn.com,和yahoo.com.ca等),和Valid(布尔值)。
现在,我想按EmailServer对这些电子邮件进行分组,然后将每个大组(计数〉=6)的前3条记录更新为Valid = true,而将每个组的其余记录保留为Valid = false。
我无法通过以下查询获得所需的输出:

UPDATE public."EmailContacts"   
SET "Valid"=true
WHERE "EmailServer" IN (
    SELECT "EmailServer"
    FROM public."EmailContacts"
    GROUP by "EmailServer"
    HAVING count(*) >=6
    LIMIT 5)

请帮助修改,以便得到预期的效果。将非常感谢您的任何帮助!

o4hqfura

o4hqfura1#

WITH major_servers AS (
    SELECT email_server
    FROM email_address
    GROUP by email_server
    HAVING count(*) >=6
),
enumerated_emails AS (
    SELECT email,
           email_server,
           row_number() OVER (PARTITION BY email_server ORDER BY email) AS row_number --TODO:: ORDER BY email - attention
    FROM email_address
    WHERE email_server IN (SELECT email_server FROM major_servers)
)
UPDATE email_address
SET valid = true
WHERE email IN (SELECT email
                FROM enumerated_emails ee
                WHERE ee.row_number <= 3);
  • 第一个查询major_servers查找存在5个以上电子邮件服务器的主要组。
  • 第二个查询enumerated_emails使用窗口函数row_number()按照邮件的自然顺序(* 请参见TODO注解,我认为您应该选择另一个ORDER BY条件 *)枚举属于主要组的邮件。
  • 最后一个查询更新每个主要服务器组中的前3行。

在此处查找sql-fiddle

9gm1akwq

9gm1akwq2#

你需要获得服务器,然后排序邮件从哪个,然后执行更新。

WITH DataSourceServers AS 
(
    SELECT "EmailServer"
    FROM public."EmailContacts"
    GROUP by "EmailServer"
    HAVING count(*) >=6
),DataSourceEmails AS
(
    SELECT "Email", row_number() OVER (PARTITION BY "EmailServer" ORDER BY "Email") AS rn
    FROM public."EmailContacts"
    WHERE "EmailServer" IN (SELECT "EmailServer" FROM DataSourceServers)
)
UPDATE public."EmailContacts"
SET "Valid" = true
FROM  public."EmailContacts" E
INNER JOIN DataSourceEmails SE
WHERE E."EmailServer" = SE."EmailServer"
    AND E."Email" = SE."Email"
    AND SE.rn <= 3;

相关问题