按另一个表分组,该表已用两个子查询分组

jvidinwx  于 2021-07-24  发布在  Java
关注(0)|答案(12)|浏览(415)

我有一张这样的table

Table1

ID      |    Val         |     Val2       |
606541  |3175031503131004|3175032612900004|
606542  |3175031503131004|3175032612900004|
677315  |3175031503131004|3175032612980004|
222222  |1111111111111111|8888888888888888|
231233  |1111111111111111|3175032612900004|
111111  |9999992222211111|1111111111111111|
57      |3173012102121018|3173015101870020|
59      |3173012102121018|3173021107460002|
2       |900             |7000            |
4       |900             |7001            |

我有两个条件 Val 以及 Val2 . 如果 Val :
val列至少有两个或多个重复值,并且
val2列没有重复值(唯一)
例如:
样本1

ID      |    Val         |     Val2       |
 606541  |3175031503131004|3175032612900004|
 606542  |3175031503131004|3175032612900004|  
 677315  |3175031503131004|3175032612980004|

 False, because  even the Val column 
 had two or more duplicate but the Val2 
 had dulicate value (ID 606541  and 606542)

示例预期1个结果

No records

样本2

ID      |    Val         |     Val2       |
 222222  |1111111111111111|8888888888888888|
 231233  |1111111111111111|3175032612900004|   
 111111  |9999992222211111|1111111111111111|

 True, Because the condition is match, 
 Val column had duplicate value AND Val2 had unique values

样本2预期结果

ID      |    Val         |     Val2       |
 222222  |1111111111111111|8888888888888888|
 231233  |1111111111111111|3175032612900004|

样品3

ID      |    Val         |     Val2       |
 606541  |3175031503131004|3175032612900004|
 606542  |3175031503131004|3175032612900004|
 677315  |3175031503131004|3175032612980004|
 222222  |1111111111111111|8888888888888888|     
 231233  |1111111111111111|3175032612900004|
 111111  |9999992222211111|1111111111111111|

 Note : This is false condition, Because even the value for id 606541, 606542, and
 677315 in column Val had duplicate value at least 
 two or more but the value in column Val2 had no unique value (it could be true condition if id 606541, 
 606542, and 677315 had 3 different value on Val2).

 NOte 2 : for Id 222222 and 231233 that had duplicate value, this is still false, because the column 
 Val2 with ID 231233 had the same value with ID 606542 and 606541 (3175032612900004), so it didnt match 
 the second condition which only have no duplicate value

样本3预期结果

No records

现在回到 Table1 在前面的例子中,我尝试用这个查询显示来自两个条件的结果

SELECT
tb.* FROM table1 tb 
WHERE
    tb.Val2 IN (
    SELECT ta.Val2 
    FROM (
        SELECT
            t.* 
        FROM
            table1 t 
        WHERE
            t.Val IN ( 
            SELECT Val FROM table1 
            GROUP BY Val 
            HAVING count( Val ) > 1 ) 
        ) ta 
    GROUP BY
        ta.Val2 
    HAVING
    count( ta.Val2 ) = 1 
    )

结果呢

ID         Val                   Val2
677315  3175031503131004    3175032612980004
222222  1111111111111111    8888888888888888
57      3173012102121018    3173015101870020
59      3173012102121018    3173021107460002
2       900                  7000            
4       900                  7001

我想结果是这样的:

ID         Val                   Val2
57  3173012102121018    3173015101870020
59  3173012102121018    3173021107460002
2       900             7000            
4       900             7001

我的问题有问题吗?
这是我的小提琴。

e0bqpujr

e0bqpujr1#

请原谅任何错误,因为这将是我在这个论坛上的第一个答案。你也可以试试下面,我同意窗口功能的答案。

SELECT t.*
FROM   table1 t 
WHERE  t.val IN (SELECT val 
                   FROM table1 
                 GROUP BY val 
                 HAVING COUNT(val) > 1 
                    AND COUNT(val) = COUNT(DISTINCT val2)
                 )
AND    t.val NOT IN (SELECT t.val
                     FROM   table1 t
                     WHERE  EXISTS (SELECT 1
                             FROM   table1 tai
                             WHERE  tai.id != t.id
                             AND    tai.val2 = t.val2));

/where子句的第一部分确保我们在val2列中为val列中的重复值设置了不同的值
带有notin的where子句的第二部分告诉我们,对于val2列中的值,不同id之间没有值共享
/
--逆序查询(不确定是否给出预期结果)

SELECT t.*
FROM   table2 t
WHERE  t.val IN (SELECT val FROM table2 GROUP BY val HAVING COUNT(val) = 1)
AND    t.val2 IN (SELECT t.val2
                  FROM   table2 ta
                  WHERE  EXISTS (SELECT 1
                          FROM   table2 tai
                          WHERE  tai.id != ta.id
                          AND    tai.val = ta.val));
cwdobuhd

cwdobuhd2#

你必须使用 Group By 寻找 val & val2 具有重复值并且需要使用 Inner Join 以及 Left Join 为了在给定条件下包括/消除记录(反对 IN , NOT IN 在处理大数据时可能导致性能问题的子句)。
请查找以下查询:

select t1.*from table1 t1 left join
      (select val from table1
       where val2 in (select val2 from table1 group by val2 having count(id) > 1)
        ) t2
 on t1.val = t2.val
 inner join
     (select val from table1 group by val having count(id) >1) t3
     on t1.val = t3.val
 where t2.val is null

反向条件查询:

select t1.*from table1 t1 inner join
       (select val from table1 group by val having count(id) = 1)
         t2
 on t1.val = t2.val
 inner join
     (select val2 from table1 group by val2 having count(id) >1) t3
     on t1.val2 = t3.val2

请在这里查找这两个查询的fiddle。

q3aa0525

q3aa05253#

你能试试这个让我知道结果吗?sql小提琴

SELECT t1.id, t1.val, t1.val2 FROM table1 t1
JOIN (
  select val from
  (select id, val, val2 from table1 group by val2 having count(1) = 1) a
  group by a.val having count(1) > 1
)t2 on t1.val = t2.val;
wqnecbli

wqnecbli4#

您可以使用分组方式:

select * from (select * from #table1 where Val2 in (select Val2 val from #table1 group by Val2 having COUNT(*) =1 )) select1
         where select1.val in  (select Val val from #table1 group by Val having COUNT(*) >1)

或者你可以使用等级:

select * from  ( SELECT 
     i.id,
    i.Val val,
    RANK() OVER (PARTITION BY i.val ORDER BY i.id DESC) AS Rank1,
        RANK() OVER (PARTITION BY i.val2 ORDER BY i.id DESC) AS Rank2
FROM #table1 AS i 

) select1 where  select1.Rank1 >1 or select1.Rank2 =2
ldxq2e6h

ldxq2e6h5#

你不需要一组一组的人。替补人选会做得很好。

SELECT * FROM MyTable a
WHERE  (SELECT Count(*) FROM MyTable b WHERE  a.val = b.val) >= 2 
AND (SELECT Count(*)  FROM MyTable c WHERE  a.val2 = c.val2) = 1;

这张table看起来好像是3张完全相同的table,但只有一张。第一个子选择
(从mytable b中选择count(),其中a.val=b.val)
返回一个包含“val”在表中出现次数的数字;如果至少有两个,我们就可以走了。第二个子选择
(从mytable c中选择count(
),其中a.val2=c.val2)
返回一个包含“val2”在表中出现次数的数字;如果它是1,并且第一个子选择返回至少2,那么我们打印记录。

iqjalb3h

iqjalb3h6#

如果你想解决问题,我想这会有帮助的。
我得到的val2s没有重复的vals有1个以上的重复和连接

Select t.* from 
table1 t
inner join 
(Select val2 from table1 group by val2 having count(*) = 1) tv2 on t.val2 = tv2.val2
inner join 
(Select val from table1 group by val having count(*) > 1) tv on t.val = tv.val;
wdebmtf2

wdebmtf27#

你可以用它 EXISTS 以及 NOT EXISTS .
如果你只想要那一列 Val :

select t1.val from table1 t1
where not exists (
  select 1 from table1 
  where val = t1.val and val2 in (select val2 from table1 group by val2 having count(*) > 1)
)
group by t1.val
having count(t1.val) > 1

如果要整行:

select t1.* from table1 t1
where exists (select 1 from table1 where id <> t1.id and val = t1.val)
and not exists (
  select 1 from table1 
  where val = t1.val and val2 in (select val2 from table1 group by val2 having count(*) > 1)
)

以及一个针对mysql 8.0+的窗口函数解决方案:

select t.id, t.val, t.val2
from (
  select *, max(counter2) over (partition by val) countermax
  from (
    select *,
      count(*) over (partition by val) counter,
      count(*) over (partition by val2) counter2
    from table1
  ) t
) t 
where t.counter > 1 and t.countermax = 1

请看演示。

yzckvree

yzckvree8#

公共表表达式可能有助于提高可读性,也可能有助于提高性能。

with dup as (select  val, count(*) -- two or more of val
             from table1
             group by val
             having count(*)>1)   
select  tb1.* 
from table1 tb1
 inner join dup
  on dup.val = tb1.val
where not exists (select val2, count(*) -- Not exists is generally fast
                  from table1
                  where val = tb1.val
                  group by 1
                  having count(*) > 1)

小提琴

wqsoz72f

wqsoz72f9#

我正在浏览你的数据集,当你把结果和原始数据集进行比较时,我觉得你的最终结果是准确的。您使用的标准是: Val 至少复制一次 Val2 是独一无二的 9999992222211111 是世界上唯一独一无二的价值 Val 列表,所以这是我不希望在最终结果中看到的唯一值。为了 Val2 ,唯一的重复值是 3175032612900004 所以我不希望看到最后的结果。
听起来您要做的是将原始条件应用于最终结果表(与原始数据表不同)。如果这正是您所追求的,那么您可以执行应用于原始表到新表的相同过程,在该过程中,您将得到所需的确切结果。
我把这些都记在下面的小提琴里了。您将看到两个输出查询,一个是您看到的结果,另一个是您想要的结果。如果这能回答你的问题,请告诉我!=)
这是我的小提琴:小提琴

vc6uscn9

vc6uscn910#

你的问题的答案是什么
我的问题有问题吗?
在样本3的注解2中
注2:对于具有重复值的ID222222和231233,这仍然是错误的,因为具有ID231233的列val2具有与ID606542和606541(317503261290004)相同的值,所以它不匹配第二个条件,该条件只有没有重复值
如果val2与集合外的另一个记录重复,则不能消除这些记录。因此,在查询中只需添加以下条件

AND tb.Val NOT IN (SELECT t.Val
               FROM table1 t 
               WHERE t.Val2 IN (SELECT Val2 FROM table1 GROUP BY Val2 HAVING count( Val2 ) > 1 ))

我已将此条件添加到您的查询中,并查看预期结果。见下面的小提琴
我的小提琴
@govind给出的答案感觉更像是对你的需求的重新书写。仅当val2列中没有重复项时,它才检查val列的重复项。非常简洁明了的查询。
戈文德的回答

ljo96ir5

ljo96ir511#

像这样的?

SELECT *
  FROM table1
 WHERE val IN
       (SELECT val
          FROM table1
         GROUP BY val
        HAVING COUNT(*) > 1 AND COUNT(DISTINCT val2) = COUNT(*))
   AND val NOT IN (SELECT t.val
                     FROM table1 t
                    INNER JOIN (SELECT val2
                                 FROM table1
                                GROUP BY val2
                               HAVING COUNT(*) > 1) x
                       ON x.val2 = t.val2);
gv8xihay

gv8xihay12#

`select val, count(*) from table1 group by val having count(*)>=2;`

`val                count(*)`

`1111111111111111   2`

`3173012102121018   2`

`3175031503131004   3`

`900                2`

val列至少有两个或多个重复值-true select val2, count(*) from table1 group by val2 having count(*)>1; ```
val2 count(*)
3175032612900004 3

val2列没有重复值(唯一)-false
所以理想情况下你应该 `no records found right` ?

相关问题