MySQL: SQL query that selects pairs of members only when they share exactly the same values [duplicate]

Posted by sd2nnvve on 2023-05-16 in MySQL

This question already has answers here:

SQL query for finding pairs that share the same set of values (2 answers)
Closed 15 hours ago.
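I have the following table members:
| ID | hobby |
| --- | --- |
| 1 | Football |
| 1 | Tennis |
| 1 | Football |
| 2 | Cards |
| 2 | Painting |
| 3 | Tennis |
| 3 | Football |
| 4 | Cards |

I want to select pairs of members only if they have exactly the same hobbies (ignoring duplicate rows). So for the table above, I want the query to output:
| ID1 | ID2 |
| --- | --- |
| 1 | 3 |

My query: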

SELECT m1.id as id1 , m2.id as id2
FROM members m1 inner join members m2
ON m1.id < m2.id
WHERE m1.hobby in (
  SELECT distinct(m2.hobby)
  )
GROUP BY id1,id2

But I get:
| ID1 | ID2 |
| --- | --- |
| 1 | 3 |
| 2 | 4 |
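
The extra row appears because the subquery `SELECT distinct(m2.hobby)` has no FROM clause, so it only yields the hobby of the current m2 row. The filter therefore reduces to `m1.hobby = m2.hobby`, and any pair that shares at least one hobby survives. The following is a minimal sketch of that equivalence. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, which is an assumption made only to keep the example self-contained; the table and rows follow the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER, hobby TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [(1, "Football"), (1, "Tennis"), (1, "Football"),
     (2, "Cards"), (2, "Painting"),
     (3, "Tennis"), (3, "Football"),
     (4, "Cards")],
)

# The query from the question, with an ORDER BY added for stable output.
original = conn.execute("""
    SELECT m1.id AS id1, m2.id AS id2
    FROM members m1 INNER JOIN members m2
      ON m1.id < m2.id
    WHERE m1.hobby IN (SELECT DISTINCT m2.hobby)
    GROUP BY id1, id2
    ORDER BY id1, id2
""").fetchall()

# The same filter written as a plain equality: "share at least one hobby".
simplified = conn.execute("""
    SELECT DISTINCT m1.id AS id1, m2.id AS id2
    FROM members m1 INNER JOIN members m2
      ON m1.id < m2.id AND m1.hobby = m2.hobby
    ORDER BY id1, id2
""").fetchall()

print(original)    # [(1, 3), (2, 4)]
print(simplified)  # [(1, 3), (2, 4)]
```

Members 2 and 4 share Cards, which is enough to pass this filter even though member 2 also has Painting.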

wnvonmuf #1

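One way to do this is:

  • Count how many distinct hobbies each ID has
  • Self-join on different ids to capture the matching hobbies and the hobby counts
  • Make sure the hobby count equals the number of matching records for each pair of ids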
WITH cte AS (
    SELECT ID, 
           hobby,
           COUNT(hobby) OVER(PARTITION BY ID) AS cnt
    FROM members
    GROUP BY ID, 
             hobby
)
SELECT t1.ID AS id1, 
       t2.ID AS id2
FROM       cte t1
INNER JOIN cte t2
        ON t1.ID < t2.ID 
       AND t1.hobby = t2.hobby
       AND t1.cnt = t2.cnt
GROUP BY t1.ID, t2.ID, t1.cnt
HAVING COUNT(*) = t1.cnt

Output:

| ID1 | ID2 |
| --- | --- |
| 1 | 3 |

Check the demo here.

9rnv2umw #2

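You can do this by using GROUP_CONCAT to collect the ids for each hobby, then using SUBSTRING_INDEX to split the concatenated pair.
The following query returns the id lists that occur for more than one hobby: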

SELECT pairs
FROM (
  select hobby, GROUP_CONCAT(DISTINCT ID) as pairs
  from members
  group by hobby
) as s
GROUP BY pairs
HAVING COUNT(pairs) > 1

Result:

pairs
1,3

The comma-separated pair is then converted into columns in the final query:

WITH cte as (
  SELECT pairs
  FROM (
    select hobby, GROUP_CONCAT(DISTINCT ID) as pairs
    from members
    group by hobby
  ) as s
  GROUP BY pairs
  HAVING COUNT(pairs) > 1
)
select SUBSTRING_INDEX(pairs, ',', 1) AS ID1,
       SUBSTRING_INDEX(pairs, ',', -1) AS ID2
from cte

Result:

ID1 ID2
1   3

Demo here
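
The grouping logic can be mirrored outside the database to see how it behaves on other inputs. The sketch below is plain Python, not the answer's SQL: the function name `pairs_by_hobby_grouping` and the second data set are hypothetical, and the id lists are sorted explicitly because GROUP_CONCAT without ORDER BY does not promise an order.

```python
from collections import Counter, defaultdict

def pairs_by_hobby_grouping(rows):
    """Collect the ids per hobby, then keep the id lists that occur for
    more than one hobby (the HAVING COUNT(pairs) > 1 step)."""
    ids_per_hobby = defaultdict(set)
    for member_id, hobby in rows:
        ids_per_hobby[hobby].add(member_id)
    # One id list per hobby, like GROUP_CONCAT(DISTINCT ID) ... GROUP BY hobby.
    id_lists = Counter(tuple(sorted(ids)) for ids in ids_per_hobby.values())
    return sorted(ids for ids, n in id_lists.items() if n > 1)

question_rows = [
    (1, "Football"), (1, "Tennis"), (1, "Football"),
    (2, "Cards"), (2, "Painting"),
    (3, "Tennis"), (3, "Football"),
    (4, "Cards"),
]
print(pairs_by_hobby_grouping(question_rows))  # [(1, 3)]

# Hypothetical data: member 4 has every hobby of member 2, plus Chess.
extra_rows = [
    (2, "Cards"), (2, "Painting"),
    (4, "Cards"), (4, "Painting"), (4, "Chess"),
]
print(pairs_by_hobby_grouping(extra_rows))  # [(2, 4)]
```

On the second data set the pair (2, 4) is reported although the two hobby sets differ, which suggests this approach depends on the shape of the sample data. It would also miss pairs of members with a single identical hobby, and SUBSTRING_INDEX keeps only the first and last id when three or more members share a list.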

lqfhib0f #3

with data (id,hobby) as (
    select 1, 'Tennis'  union all
    select 1, 'Football'  union all
    select 1, 'Football'  union all
    select 2, 'Cards'  union all
    select 2, 'Painting'  union all
    select 3, 'Tennis'  union all
    select 3, 'Football'  union all
    select 4, 'Cards' union all
    select 5, 'Tennis' union all
    select 5, 'Football' union all
    select 5, 'Cards'
)
, udata(id,hobby) as (
    select distinct id, hobby 
    from data
)
, cdata(id, n) as (
    select id, count(distinct hobby) as n
    from data
    group by id
)
select id1, id2 from (
    select u1.id as id1, u2.id as id2, count(*) as n, 
      c1.n as no1, c2.n as no2
    from udata u1
    join udata u2 on u2.id > u1.id and u1.hobby = u2.hobby
    join cdata c1 on c1.id = u1.id
    join cdata c2 on c2.id = u2.id
    group by u1.id, u2.id
) t
where n = no1 and n = no2
;

(You could add count(distinct hobby) over (partition by id) as n in udata and add a condition on n to the JOIN between u1 and u2, but MySQL does not yet support count distinct over partition...)
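
One possible way around that limitation is to count over rows that are already de-duplicated, since a plain COUNT(*) window then equals the distinct hobby count. The sketch below is an assumption-laden illustration rather than the answer's query: it runs on Python's built-in `sqlite3` (window functions need SQLite 3.25 or later) instead of MySQL, and the CTE name `ndata` is hypothetical. The data matches the CTE in the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER, hobby TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [(1, "Tennis"), (1, "Football"), (1, "Football"),
     (2, "Cards"), (2, "Painting"),
     (3, "Tennis"), (3, "Football"),
     (4, "Cards"),
     (5, "Tennis"), (5, "Football"), (5, "Cards")],
)

rows = conn.execute("""
    WITH udata AS (
        SELECT DISTINCT id, hobby FROM members
    ),
    ndata AS (
        -- udata rows are unique, so COUNT(*) per id is the distinct hobby count
        SELECT id, hobby, COUNT(*) OVER (PARTITION BY id) AS n
        FROM udata
    )
    SELECT u1.id AS id1, u2.id AS id2
    FROM ndata u1
    JOIN ndata u2
      ON u2.id > u1.id
     AND u1.hobby = u2.hobby
     AND u1.n = u2.n              -- both members have the same number of hobbies
    GROUP BY u1.id, u2.id, u1.n
    HAVING COUNT(*) = u1.n        -- and every one of those hobbies matched
    ORDER BY id1, id2
""").fetchall()

print(rows)  # [(1, 3)]
```

Member 5 shares Tennis and Football with members 1 and 3 but has three hobbies, so the condition on n excludes it from the join.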

6bc51xsx #4

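A simple approach is to use string aggregation. The idea is to build a list of all hobbies for each member; we can then self-join the result to generate the pairs of users that share exactly the same list.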

with cte as (
    select id, group_concat(distinct hobby order by hobby) hobbies
    from members
    group by id
)
select c1.id as id1, c2.id as id2, c1.hobbies
from cte c1
inner join cte c2 on c1.hobbies = c2.hobbies and c1.id < c2.id

Note that it is important to *order* the hobbies in the list, so that the lists can be compared consistently.
| id1 | id2 | hobbies |
| --- | --- | --- |
| 1 | 3 | Football,Tennis |
fiddle
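
The effect of the ordering can be seen in a small stand-alone sketch. This is plain Python rather than MySQL, and the helper names `hobby_lists` and `matching_pairs` are hypothetical; it builds each member's list once in first-seen order and once sorted, then compares the lists as strings the way the self-join does.

```python
from itertools import combinations

def hobby_lists(rows, ordered):
    """Build one comma-separated hobby list per member, duplicates removed."""
    seen = {}
    for member_id, hobby in rows:
        # A dict keeps first-seen order and drops repeated hobbies.
        seen.setdefault(member_id, {})[hobby] = None
    return {
        member_id: ",".join(sorted(hobbies) if ordered else hobbies)
        for member_id, hobbies in seen.items()
    }

def matching_pairs(lists):
    """Pairs (id1, id2) with id1 < id2 whose lists are equal strings."""
    return [(a, b) for a, b in combinations(sorted(lists), 2)
            if lists[a] == lists[b]]

rows = [
    (1, "Football"), (1, "Tennis"), (1, "Football"),
    (2, "Cards"), (2, "Painting"),
    (3, "Tennis"), (3, "Football"),
    (4, "Cards"),
]

unordered = hobby_lists(rows, ordered=False)
ordered = hobby_lists(rows, ordered=True)

print(unordered[1], "|", unordered[3])  # Football,Tennis | Tennis,Football
print(matching_pairs(unordered))        # []
print(matching_pairs(ordered))          # [(1, 3)]
```

Without sorting, members 1 and 3 produce different strings for the same set of hobbies, so the equality join would miss the pair.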
