从mysql数据库中选择多个字段副本

mm5n2pyu  于 2021-06-21  发布在  Mysql
关注(0)|答案(1)|浏览(302)

我有一个旧论坛,其中包含重复的第一个帖子线程(可能不同的答复)。我想删除除一个线程以外的所有线程(使该线程具有最高的视图计数)。
我有以下sql查询来帮助识别重复的线程,但我找不到方法让它只列出xf\u thread.view\u count列中值最低的重复线程:

SELECT
    t.thread_id, MIN(t.view_count)
FROM
    xf_thread t
INNER JOIN
    xf_post p ON p.thread_id = t.thread_id
WHERE
    t.first_post_id = p.post_id
GROUP BY
    t.title,
    t.username,
    p.message
HAVING
    COUNT(t.title) > 1
    AND COUNT(t.username) > 1
    AND COUNT(p.message) > 1;

目前,这个查询可以正确地对线程进行分组,但它只显示一个随机的线程id,而不是对应于min(view\u count)的线程id。
我已经阅读了如何解决这个问题,但我不知道如何实现这一点-因为看起来不可能按查询分组对行排序。
编辑
多亏了madhur的帮助,查询现在返回所有要删除的线程id。但是,我可以找出如何删除具有匹配线程id的行。下面是我尝试使用的查询(它只是一直在运行,而select查询(https://stackoverflow.com/a/52314208/2469308)以秒为单位运行:

DELETE FROM xf_thread 
WHERE  thread_id IN (SELECT Substring_index(Group_concat(DISTINCT t.thread_id 
                                            ORDER BY 
                                            t.view_count 
                                                                ASC 
                                            SEPARATOR ','), 
                                   ',', 1) AS 
                            thread_id_with_minimum_views 
                     FROM   (SELECT * 
                             FROM   xf_thread) t 
                            INNER JOIN xf_post p 
                                    ON p.thread_id = t.thread_id 
                     WHERE  t.first_post_id = p.post_id 
                            AND t.user_id = 0 
                            AND t.reply_count < 2 
                     GROUP  BY t.title, 
                               t.username, 
                               p.message 
                     HAVING Count(t.title) > 1 
                            AND Count(t.username) > 1 
                            AND Count(p.message) > 1 
                     ORDER  BY t.thread_id);
9fkzdhlc

9fkzdhlc1#

一个非常棘手的解决办法是 thread_id 排序依据 view_count 在一个小组里。然后,我们可以利用字符串操作来获得 thread_id 最小值 view_count .
在你的 SELECT 子句,而不是 t.thread_id ,您可以尝试以下操作:

SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT t.thread_id 
                             ORDER BY t.view_count ASC 
                             SEPARATOR ','), 
                ',', 
                1) AS thread_id_with_minimum_views

现在,根据 SELECT 查询以识别具有最小视图的重复记录, DELETE 查询以从 xf_thread 表格如下:

DELETE t_delete FROM xf_thread AS t_delete 
INNER JOIN (SELECT CAST(SUBSTRING_INDEX(GROUP_CONCAT(DISTINCT t.thread_id ORDER BY t.view_count ASC SEPARATOR ','), ',', 1) AS UNSIGNED) AS tid_min_view 
            FROM (SELECT * FROM xf_thread) t 
            INNER JOIN xf_post p ON p.thread_id = t.thread_id 
            WHERE t.first_post_id = p.post_id 
              AND t.user_id = 0 
              AND t.reply_count < 2 
            GROUP BY t.title, t.username, p.message 
            HAVING Count(t.title) > 1 
               AND Count(t.username) > 1 
               AND Count(p.message) > 1 
            ORDER BY t.thread_id) AS t_dup 
  ON t_delete.thread_id = t_dup.tid_min_view

相关问题