Get array_agg to filter duplicates based on another column

hiz5n14c  posted on 2021-07-26  in Java

I am trying to express a requirement that can be modeled with books and authors, as follows:

CREATE TABLE author (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL);
CREATE TABLE book (id BIGSERIAL UNIQUE NOT NULL, title TEXT NOT NULL, author_id BIGINT NOT NULL REFERENCES author(id));
CREATE TABLE fan (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL, book_id BIGINT NOT NULL REFERENCES book(id));

experiments=# SELECT * FROM author;
 id |        name
----+--------------------
  2 | Johnathan Williams
  3 | Lewis Carroll
  4 | Guy Fox
(3 rows)

experiments=# SELECT * FROM book;
 id |        title        | author_id
----+---------------------+-----------
  1 | Fairy tales         |         2
  2 | Alice in Wonderland |         3
  3 | Some other book     |         4
  4 | The final book      |         3
  5 | Some other book     |         4
(5 rows)

experiments=# SELECT * FROM fan;
 id | name  | book_id
----+-------+---------
  1 | Alex  |       1
  2 | Alice |       2
  3 | Jimmy |       3
  4 | James |       4
  5 | Nate  |       3
  6 | Mary  |       2
  7 | Alex  |       1
  8 | Mary  |       2
(8 rows)

To recreate the database, you can use the following code:

CREATE TABLE author (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL);
CREATE TABLE book (id BIGSERIAL UNIQUE NOT NULL, title TEXT NOT NULL, author_id BIGINT NOT NULL REFERENCES author(id));
CREATE TABLE fan (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL, book_id BIGINT NOT NULL REFERENCES book(id));
INSERT INTO author (name, id) VALUES ('Johnathan Williams', 2), ('Lewis Carroll', 3), ('Guy Fox', 4);
INSERT INTO book (title, author_id) VALUES ('Fairy tales', 2), ('Alice in Wonderland', 3), ('Some other book', 4), ('The final book', 3), ('Some other book', 4);
INSERT INTO fan (name, book_id) VALUES ('Alex', 1), ('Alice', 2), ('Jimmy', 3), ('James', 4), ('Nate', 3), ('Mary', 2);
INSERT INTO fan (name, book_id) VALUES ('Alex', 1), ('Mary', 2);

Now I need the authors together with arrays of their book titles and their fans. This is how I did it:

SELECT
  (SELECT array_agg(author.name))[1] AS author_name,
  author.id AS author_id,
  array_agg(fan.name ORDER BY fan.id) AS fan_names,
  array_agg(DISTINCT fan.id ORDER BY fan.id) AS fan_ids,
  array_agg(book.title ORDER BY book.id) AS book_titles,
  array_agg(DISTINCT book.id ORDER BY book.id) AS book_ids
 FROM
  author JOIN book on author.id=book.author_id JOIN fan ON fan.book_id=book.id
GROUP BY author.id;

This is what I get:

    author_name     | author_id |        fan_names        |  fan_ids  |                                     book_titles                                      | book_ids
--------------------+-----------+-------------------------+-----------+--------------------------------------------------------------------------------------+----------
 Johnathan Williams |         2 | {Alex,Alex}             | {1,7}     | {"Fairy tales","Fairy tales"}                                                        | {1}
 Lewis Carroll      |         3 | {Alice,James,Mary,Mary} | {2,4,6,8} | {"Alice in Wonderland","Alice in Wonderland","Alice in Wonderland","The final book"} | {2,4}
 Guy Fox            |         4 | {Jimmy,Nate}            | {3,5}     | {"Some other book","Some other book"}                                                | {3}

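Of course, the book list contains duplicates too, because the join with fans multiplies the rows. I need to get rid of these "artificial" duplicates, but I can't simply use array_agg(DISTINCT), because, as you can see, an author may have published several books with the same title, and I need to preserve that information (I know an author shouldn't do that, but this is just a model!).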
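The row multiplication can be made visible with a small diagnostic query. This is only a sketch against the sample tables above; the column aliases joined_rows, books and fans are illustrative names, not part of the original query.

```sql
-- Compare how many joined rows each author produces with how many
-- distinct books and fans actually sit behind those rows.
SELECT author.id               AS author_id,
       count(*)                AS joined_rows,
       count(DISTINCT book.id) AS books,
       count(DISTINCT fan.id)  AS fans
FROM author
JOIN book ON author.id = book.author_id
JOIN fan  ON fan.book_id = book.id
GROUP BY author.id
ORDER BY author.id;
```

Whenever joined_rows is larger than books, every aggregate over book columns sees some book more than once. With the sample data, author 3 should come out with 4 joined rows for only 2 books.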
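I have found two workarounds:
The first that comes to mind is a nested query plus a separate join/aggregation for fans. I would like to avoid nested queries over whole tables, because the extra joins slow things down considerably.
The other, somewhat obvious and somewhat ugly, is to load all the duplicates and filter them programmatically in the language I send the request from (for example, if I have 5 IDs and 20 names, just pick every 4th name). That looks very wrong, and I feel a good solution is close but I just can't see it.
Besides these workarounds being generally unappealing, the real case is more complex and has more than two joins, so the number of duplicates can grow into the hundreds and the number of nested queries to 10-15. Even if I tolerate some ugliness in this simple example, the cost will be high in the real one, especially if I need to add more joins later.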
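For reference, the first workaround could look roughly like the sketch below. It is an assumption about what "nested query plus a separate fan aggregation" means here, not the query actually used; the aliases bk, fn, b and f are illustrative.

```sql
-- Aggregate books and fans separately, one row per author each,
-- so neither aggregation sees rows multiplied by the other join.
SELECT a.name AS author_name,
       a.id   AS author_id,
       f.fan_names, f.fan_ids,
       b.book_titles, b.book_ids
FROM author a
JOIN (
    SELECT bk.author_id,
           array_agg(bk.title ORDER BY bk.id) AS book_titles,
           array_agg(bk.id ORDER BY bk.id)    AS book_ids
    FROM book bk
    GROUP BY bk.author_id
) b ON b.author_id = a.id
LEFT JOIN (
    SELECT bk.author_id,
           array_agg(fn.name ORDER BY fn.id) AS fan_names,
           array_agg(fn.id ORDER BY fn.id)   AS fan_ids
    FROM fan fn
    JOIN book bk ON bk.id = fn.book_id
    GROUP BY bk.author_id
) f ON f.author_id = a.id
ORDER BY a.id;
```

Unlike the single three-way join, this variant also lists books that have no fans, and the LEFT JOIN leaves the fan arrays NULL for an author whose books have no fans at all.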
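Is there a simple solution I'm missing? It feels like I'm looking for something along the lines of array_agg(DISTINCT ON (fan.id) fan.name), which of course doesn't work because it is not valid SQL.
Thanks in advance for your help.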
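One way to approximate a DISTINCT ON style aggregate without leaving the single GROUP BY is to aggregate the id and the value together, so that DISTINCT compares the pair rather than the name or title alone. The sketch below does this with jsonb_agg and jsonb_build_object, assuming PostgreSQL 9.5 or later. It is a swapped-in technique, and it returns JSON arrays of objects instead of the parallel name and id arrays of the original query.

```sql
-- DISTINCT is applied to the whole {id, ...} object, so two books with
-- the same title but different ids are both kept, while copies of the
-- same row produced by the join collapse into one element.
SELECT author.name AS author_name,
       author.id   AS author_id,
       jsonb_agg(DISTINCT jsonb_build_object('id', fan.id, 'name', fan.name))     AS fans,
       jsonb_agg(DISTINCT jsonb_build_object('id', book.id, 'title', book.title)) AS books
FROM author
JOIN book ON author.id = book.author_id
JOIN fan  ON fan.book_id = book.id
-- author.id is declared UNIQUE rather than PRIMARY KEY, so name is grouped explicitly
GROUP BY author.id, author.name;
```

Because the joins are still inner joins, books without fans remain excluded, exactly as in the original query.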

esyap4oy1#

You can do what you want with lateral joins, although I don't think it makes much sense:

SELECT a.name AS author_name, a.id AS author_id,
       f.fan_names, f.fan_ids, b.book_titles, b.book_ids
FROM author a CROSS JOIN LATERAL
     -- one row per author: that author's books, aggregated on their own
     (SELECT ARRAY_AGG(b.title ORDER BY b.id) AS book_titles,
             ARRAY_AGG(b.id ORDER BY b.id) AS book_ids
      FROM book b
      WHERE a.id = b.author_id
     ) b CROSS JOIN LATERAL
     -- one row per author: the fans of any of those books
     (SELECT ARRAY_AGG(f.name ORDER BY f.id) AS fan_names,
             ARRAY_AGG(f.id ORDER BY f.id) AS fan_ids
      FROM fan f
      WHERE f.book_id = ANY (b.book_ids)
     ) f;
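Each lateral subquery runs once per author row, so its cost depends heavily on how quickly book rows can be found by author_id and fan rows by book_id. PostgreSQL does not automatically create indexes on the referencing columns of a foreign key, so a sketch like the following may help. The index names are hypothetical, and whether the planner actually uses the indexes depends on table sizes and statistics.

```sql
-- Hypothetical index names; the columns are the join keys used above.
CREATE INDEX book_author_id_idx ON book (author_id);
CREATE INDEX fan_book_id_idx    ON fan (book_id);
```

Running the query under EXPLAIN (ANALYZE) before and after creating the indexes is a reasonable way to check whether they make a difference on real data.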
