I've tried to boil my problem down to one that can be expressed in terms of books and authors, like this:
CREATE TABLE author (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL);
CREATE TABLE book (id BIGSERIAL UNIQUE NOT NULL, title TEXT NOT NULL, author_id BIGINT NOT NULL REFERENCES author(id));
CREATE TABLE fan (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL, book_id BIGINT NOT NULL REFERENCES book(id));
experiments=# SELECT * FROM author;
 id |        name
----+--------------------
  2 | Johnathan Williams
  3 | Lewis Carroll
  4 | Guy Fox
(3 rows)
experiments=# SELECT * FROM book;
 id |        title        | author_id
----+---------------------+-----------
  1 | Fairy tales         |         2
  2 | Alice in Wonderland |         3
  3 | Some other book     |         4
  4 | The final book      |         3
  5 | Some other book     |         4
(5 rows)
experiments=# SELECT * FROM fan;
 id | name  | book_id
----+-------+---------
  1 | Alex  |       1
  2 | Alice |       2
  3 | Jimmy |       3
  4 | James |       4
  5 | Nate  |       3
  6 | Mary  |       2
  7 | Alex  |       1
  8 | Mary  |       2
(8 rows)
To recreate the database, you can use the following code:
CREATE TABLE author (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL);
CREATE TABLE book (id BIGSERIAL UNIQUE NOT NULL, title TEXT NOT NULL, author_id BIGINT NOT NULL REFERENCES author(id));
CREATE TABLE fan (id BIGSERIAL UNIQUE NOT NULL, name TEXT NOT NULL, book_id BIGINT NOT NULL REFERENCES book(id));
INSERT INTO author (name, id) VALUES ('Johnathan Williams', 2), ('Lewis Carroll', 3), ('Guy Fox', 4);
INSERT INTO book (title, author_id) VALUES ('Fairy tales', 2), ('Alice in Wonderland', 3), ('Some other book', 4), ('The final book', 3), ('Some other book', 4);
INSERT INTO fan (name, book_id) VALUES ('Alex', 1), ('Alice', 2), ('Jimmy', 3), ('James', 4), ('Nate', 3), ('Mary', 2);
INSERT INTO fan (name, book_id) VALUES ('Alex', 1), ('Mary', 2);
Now I need each author together with arrays of their book titles and their fans. This is what I did:
SELECT
(SELECT array_agg(author.name))[1] AS author_name,
author.id AS author_id,
array_agg(fan.name ORDER BY fan.id) AS fan_names,
array_agg(DISTINCT fan.id ORDER BY fan.id) AS fan_ids,
array_agg(book.title ORDER BY book.id) AS book_titles,
array_agg(DISTINCT book.id ORDER BY book.id) AS book_ids
FROM
author JOIN book on author.id=book.author_id JOIN fan ON fan.book_id=book.id
GROUP BY author.id;
And this is what I get:
    author_name     | author_id |        fan_names        |  fan_ids  |                                     book_titles                                      | book_ids
--------------------+-----------+-------------------------+-----------+--------------------------------------------------------------------------------------+----------
 Johnathan Williams |         2 | {Alex,Alex}             | {1,7}     | {"Fairy tales","Fairy tales"}                                                        | {1}
 Lewis Carroll      |         3 | {Alice,James,Mary,Mary} | {2,4,6,8} | {"Alice in Wonderland","Alice in Wonderland","Alice in Wonderland","The final book"} | {2,4}
 Guy Fox            |         4 | {Jimmy,Nate}            | {3,5}     | {"Some other book","Some other book"}                                                | {3}
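Naturally, the list of books contains duplicates, because the join with fan multiplies the rows. I need to get rid of these "artificial" duplicates, but I can't simply use array_agg(DISTINCT), because, as you can see, an author can publish several books with the same title, and I need to preserve that information (I know authors shouldn't do that, but it's just a model!).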
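To make the row multiplication concrete, here is a small sketch against the tables above. It assumes PostgreSQL and the sample data from the setup script; the column aliases (`fan_name`, `distinct_titles`, `real_book_count`) are only illustrative. The first query shows the join before any aggregation, and the second shows why DISTINCT on titles is lossy.

```sql
-- Un-aggregated join: one row per (book, fan) pair, so a book with
-- N fans shows up N times before array_agg ever sees it.
SELECT author.id AS author_id,
       book.id   AS book_id,
       book.title,
       fan.id    AS fan_id,
       fan.name  AS fan_name
FROM author
JOIN book ON book.author_id = author.id
JOIN fan  ON fan.book_id = book.id
ORDER BY author.id, book.id, fan.id;

-- DISTINCT on titles: no fan join is involved here, yet two different
-- books that share a title are still merged into a single array element.
SELECT author.id                      AS author_id,
       array_agg(DISTINCT book.title) AS distinct_titles,
       count(DISTINCT book.id)        AS real_book_count
FROM author
JOIN book ON book.author_id = author.id
GROUP BY author.id
ORDER BY author.id;
```

With the sample data, Guy Fox (author 4) owns books 3 and 5, both titled "Some other book", so the second query reports a single distinct title next to a book count of 2.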
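I've found two ways to work around this:
The first thing that comes to mind is nested queries plus a separate join/aggregation for the fans. I'd like to avoid nested queries over whole tables, because the additional joins slow things down considerably.
The other way is somewhat obvious and somewhat ugly: I can load all the duplicates and filter them programmatically in the language I send the query from (for example, I have 5 ids and 20 names, so I just pick every 4th name). But this looks very wrong, and I feel that a good solution is close and I just can't see it.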
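Apart from the workarounds above being unattractive in general, the real case is more complex and has more than two joins, so the number of duplicates can grow into the hundreds and the number of nested queries to 10-15. Even if I tolerated some ugliness in this simple example, the cost in the real case would be high, especially if I need to add more joins later.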
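For reference, the first workaround could look roughly like the sketch below. This is only one possible shape of it, assuming PostgreSQL and the sample schema above, and not necessarily the exact query I have in mind for the real case; the aliases `a`, `b` and `f` are introduced here for illustration. Each one-to-many side is aggregated in its own derived table, so neither array is multiplied by the other join.

```sql
SELECT a.id   AS author_id,
       a.name AS author_name,
       f.fan_names,
       f.fan_ids,
       b.book_titles,
       b.book_ids
FROM author a
-- Books are aggregated on their own, so same-titled books survive
-- and no title is repeated once per fan.
JOIN (
    SELECT author_id,
           array_agg(title ORDER BY id) AS book_titles,
           array_agg(id ORDER BY id)    AS book_ids
    FROM book
    GROUP BY author_id
) b ON b.author_id = a.id
-- Fans are aggregated separately, reaching the author through book.
LEFT JOIN (
    SELECT book.author_id,
           array_agg(fan.name ORDER BY fan.id) AS fan_names,
           array_agg(fan.id ORDER BY fan.id)   AS fan_ids
    FROM fan
    JOIN book ON book.id = fan.book_id
    GROUP BY book.author_id
) f ON f.author_id = a.id
ORDER BY a.id;
```

Unlike the original inner-join query, this version also lists books that have no fans (book 5 in the sample data). Every additional one-to-many relation needs another derived table of this kind, which is exactly the growth in nested queries described above.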
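Is there a simple solution that I'm missing? It feels like I want something along the lines of array_agg(DISTINCT ON (fan.id) fan.name), which of course doesn't work because it isn't valid SQL.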
Thanks in advance for your help.
1 Answer
You can do what you want with lateral joins, although I'm not sure it makes sense: