How do I get a table with NULL values from a PostgreSQL INNER JOIN and UNION ALL query?

Posted by ylamdve6 on 2024-01-07 in PostgreSQL

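I query the data for a time series chart with an inner join. A list of tags is used as a where condition on the inner join. With union all I get one result table that holds the data for two cases: one query restricted to restaurants whose tag is in the list, and one query that ignores tags. When the tag list is empty, I would like rows of null values from the first select (the one before union all), but PostgreSQL returns SELECT 0, i.e. no rows at all. I want the null rows because without them I cannot separate the two cases programmatically.
How can I get the null rows in the result? Alternatively, is there a way to get results only from the select before union all when the tag list is not empty, and only from the select after union all when it is empty? If neither is possible, suggestions on how to extract the data for both cases from the current query, or how to modify it, are very welcome. There is no need to handle restaurants with multiple tags: a restaurant has either exactly one tag or no tag at all.
An example query with dummy values and commented tables is here: db<>fiddle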

-- I want null values from this first SELECT before UNION ALL unless
-- a more elegant/efficient solution is possible. But I get 'SELECT 0'.
SELECT 
  f.days AS days,
  SUM (f.waste_kgs) AS waste,
  SUM (f.prepared_kgs) AS prepared,
  f.waste_type_id AS waste_id,
  f.restaurant_id AS restaurant
FROM food f
INNER JOIN tags t
  ON f.restaurant_id = t.restaurant_id
WHERE t.name = ANY (array[]::text[]) -- Empty tag list as an argument.
  AND f.days BETWEEN '2000-1-1' and '2099-1-1'
  AND f.restaurant_id = ANY (array[1, 2, 3])
GROUP BY days, restaurant, waste_id 
-- Separately running the first SELECT returns
-- days     waste   prepared    waste_id    restaurant
-- SELECT 0

UNION ALL 
-- The same query without the INNER JOIN and tag list argument. 
SELECT
  f.days AS days,
  SUM (f.waste_kgs) AS waste,
  SUM (f.prepared_kgs) AS prepared,
  f.waste_type_id AS waste_id,
  f.restaurant_id AS restaurant
FROM food f
WHERE f.days BETWEEN '2000-1-1' and '2099-1-1'
  AND f.restaurant_id = ANY (array[1, 2, 3])
GROUP BY days, restaurant, waste_id;

-- The whole query returns
-- days       waste     prepared    waste_id    restaurant
-- 2023-01-01   2.5        16.0           1              1
-- 2023-01-02   8.6         7.3           1              2
-- 2023-01-03   10.5        1.8           1              3
-- 2023-01-03   0.8         0.0           2              3
-- SELECT 4
-- No null values to differentiate the empty tag list argument case.

Below are the tables and values from db<>fiddle that are used in the example.

CREATE TABLE restaurants (
  id int PRIMARY KEY,
  name text,
  type text
);
CREATE TABLE food (
  id int PRIMARY KEY,
  restaurant_id int REFERENCES restaurants (id),
  waste_type_id smallint NOT NULL,
  product_id int NOT NULL,
  waste_kgs decimal NOT NULL, 
  prepared_kgs decimal NOT NULL,
  customers smallint NOT NULL,
  days date NOT NULL
);
CREATE TABLE tags (
  id int PRIMARY KEY,
  restaurant_id int REFERENCES restaurants (id),
  name text
);
INSERT INTO restaurants VALUES
(1, 'restaurant_1'),
(2, 'restaurant_2'),
(3, 'restaurant_3');
INSERT INTO food VALUES 
(1, 1, 1, 1, 1.7, 8.0, 96, '2023-1-1'),
(2, 1, 1, 10, 0.5, 7.0, 96, '2023-1-1'),
(3, 1, 1, 15, 0.3, 1.0, 96, '2023-1-1'),
(4, 2, 1, 12, 7.0, 0.8, 39, '2023-1-2'),
(5, 2, 1, 10, 1.1, 5.0, 39, '2023-1-2'),
(6, 2, 1, 11, 0.5, 1.5, 39, '2023-1-2'),
(7, 3, 1, 8, 10.0, 0.3, 97, '2023-1-3'),
(8, 3, 2, 17, 0.8, 0.0, 97, '2023-1-3'),
(9, 3, 1, 11, 0.5, 1.5, 39, '2023-1-3');
INSERT INTO tags VALUES
(1, 1, 'tag_1');


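The following query on the same tables, again with an empty tag list, returns a table that does contain null values from the inner join, so I can combine it with union all (as I tried to do with the query above) and easily separate the results.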

-- The first row of the result displays calculations from the tagged restaurants. 
-- In this case it has null values because tag list is empty.
-- The second row of the result displays calculations from all the restaurants without
-- considering any tags.
SELECT Total, Total - Drinks AS "Without drinks", Drinks   
FROM (
  SELECT 
  SUM (f.prepared_kgs) 
  FILTER (
    WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
  ) AS Total,
  SUM (f.prepared_kgs)
  FILTER (
    WHERE (f.product_id = 10 OR f.product_id = 17) 
    AND f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
  ) AS Drinks
  FROM food f
  INNER JOIN tags t 
    ON t.restaurant_id = f.restaurant_id 
  WHERE t.name = ANY (array[]::text[]) -- Empty tag list as an argument.  
)
UNION ALL
-- The same query without the INNER JOIN and tag list argument.
SELECT Total, Total - Drinks AS "Without drinks", Drinks   
FROM (
  SELECT 
  SUM (f.prepared_kgs) 
  FILTER (
    WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
  ) AS Total,
  SUM (f.prepared_kgs)
  FILTER (
    WHERE (f.product_id = 10 OR f.product_id = 17) 
    AND f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
  ) AS Drinks
  FROM food f
);
-- If tag list had elements, my program uses the first row.
-- Otherwise it uses the second row. Time series query returns 
-- rows for each day but the logic I try to use is the same.
total   Without drinks  drinks
 null             null    null
 25.1             13.1    12.0
SELECT 2

Answer #1, by pu3pd22g

How to extract the data for both cases from the current query

SELECT 1 AS query, columns... FROM query1
UNION ALL    
SELECT 2 AS query, columns... FROM query2

The result will contain a column "query" whose value is 1 or 2, depending on which branch of the UNION ALL the row came from.
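Applied to the time series query from the question, this could look roughly like the sketch below. The tag list `array['tag_1']` is only an assumed example argument, and the GROUP BY columns are spelled out with their table prefix instead of the output aliases.

```sql
-- Sketch: both SELECTs from the question, each marked with a constant "query" column.
-- array['tag_1'] is an assumed example value for the tag list argument.
SELECT
  1 AS query,                         -- branch 1: tagged restaurants only
  f.days AS days,
  SUM(f.waste_kgs) AS waste,
  SUM(f.prepared_kgs) AS prepared,
  f.waste_type_id AS waste_id,
  f.restaurant_id AS restaurant
FROM food f
INNER JOIN tags t
  ON f.restaurant_id = t.restaurant_id
WHERE t.name = ANY (array['tag_1']::text[])
  AND f.days BETWEEN '2000-1-1' AND '2099-1-1'
  AND f.restaurant_id = ANY (array[1, 2, 3])
GROUP BY f.days, f.restaurant_id, f.waste_type_id

UNION ALL

SELECT
  2 AS query,                         -- branch 2: all restaurants, tags ignored
  f.days AS days,
  SUM(f.waste_kgs) AS waste,
  SUM(f.prepared_kgs) AS prepared,
  f.waste_type_id AS waste_id,
  f.restaurant_id AS restaurant
FROM food f
WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
  AND f.restaurant_id = ANY (array[1, 2, 3])
GROUP BY f.days, f.restaurant_id, f.waste_type_id;
```

With an empty tag list the first branch still contributes no rows, so the application would treat "no rows with query = 1" as the empty-list case instead of looking for null rows.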
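In your case, however, I think there is a better (and faster) solution.
The first query aggregates over a subset of the rows that the second query aggregates over. Here is a simplified version, because I am too lazy to copy all the columns and tables: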

SELECT group, sum(qty) FROM table WHERE key IN (...) GROUP BY group
UNION ALL
SELECT group, sum(qty) FROM table GROUP BY group


"key IN (...)" is your tag condition, but it can be any boolean condition. If you do this instead:

SELECT group, key IN (...) AS flag, sum(qty) FROM table GROUP BY group, flag


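then the result contains the same data, organized differently: one row holds the sum over the rows that satisfy the condition, the other the sum over the rows that do not. To get the overall total, the application has to add the sums of these two rows.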
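For the tables in the question, the flag variant might be sketched as follows. The LEFT JOIN, the COALESCE and the column name `tagged` are assumptions of this sketch, not part of the original query. It relies on the question's statement that a restaurant has at most one tag, so the join does not duplicate food rows.

```sql
-- Sketch: one pass over food, with a boolean flag instead of two UNION ALL branches.
-- array['tag_1'] is an assumed example value for the tag list argument.
SELECT
  f.days AS days,
  -- Untagged restaurants have t.name = NULL after the LEFT JOIN; COALESCE maps that to false.
  COALESCE(t.name = ANY (array['tag_1']::text[]), false) AS tagged,
  SUM(f.waste_kgs) AS waste,
  SUM(f.prepared_kgs) AS prepared,
  f.waste_type_id AS waste_id,
  f.restaurant_id AS restaurant
FROM food f
LEFT JOIN tags t
  ON t.restaurant_id = f.restaurant_id
WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
  AND f.restaurant_id = ANY (array[1, 2, 3])
GROUP BY f.days, f.restaurant_id, f.waste_type_id, tagged;
```

Because the grouping already includes the restaurant, every group here carries a single flag value. The rows with `tagged = true` correspond to the first SELECT of the original query, and all rows together correspond to the second. With an empty tag list every row has `tagged = false`.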
You can also use ROLLUP, in which case Postgres does the addition for you.
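A minimal sketch of the ROLLUP idea, using the prepared-food totals from the question. The CTE name `flagged`, the column `tagged` and the example tag list are assumptions made for illustration.

```sql
-- Sketch: per-flag sums plus a grand total computed by Postgres via ROLLUP.
-- array['tag_1'] is an assumed example value for the tag list argument.
WITH flagged AS (
  SELECT
    f.prepared_kgs,
    COALESCE(t.name = ANY (array['tag_1']::text[]), false) AS tagged
  FROM food f
  LEFT JOIN tags t
    ON t.restaurant_id = f.restaurant_id
  WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
)
SELECT
  tagged,                          -- NULL in the grand-total row
  GROUPING(tagged) AS is_total,    -- 1 marks the row added by ROLLUP
  SUM(prepared_kgs) AS total
FROM flagged
GROUP BY ROLLUP (tagged)
ORDER BY is_total, tagged;
```

The row with `is_total = 1` is the sum over all restaurants, and the row with `tagged = true` (if any) is the sum over the tagged ones. GROUPING is used so the total row does not have to be recognized by its NULL flag alone.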
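This makes the query simpler, but it may not work if your application code needs a specific format that you cannot change. You could, though, put the query above in a materialized CTE and use it twice to format the result the way you need.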
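One way the materialized CTE could be used twice to reproduce the original two-branch layout is sketched below. The names `flagged`, `tagged` and `query` and the example tag list are assumptions, and `AS MATERIALIZED` requires PostgreSQL 12 or later.

```sql
-- Sketch: aggregate once with a flag, then emit both result shapes from the CTE.
-- array['tag_1'] is an assumed example value for the tag list argument.
WITH flagged AS MATERIALIZED (
  SELECT
    f.days,
    f.restaurant_id,
    f.waste_type_id,
    COALESCE(t.name = ANY (array['tag_1']::text[]), false) AS tagged,
    SUM(f.waste_kgs) AS waste,
    SUM(f.prepared_kgs) AS prepared
  FROM food f
  LEFT JOIN tags t
    ON t.restaurant_id = f.restaurant_id
  WHERE f.days BETWEEN '2000-1-1' AND '2099-1-1'
    AND f.restaurant_id = ANY (array[1, 2, 3])
  GROUP BY f.days, f.restaurant_id, f.waste_type_id, tagged
)
-- Branch 1: only the groups that match the tag list.
SELECT 1 AS query, days, waste, prepared,
       waste_type_id AS waste_id, restaurant_id AS restaurant
FROM flagged
WHERE tagged

UNION ALL

-- Branch 2: all groups, re-aggregated so the flag no longer splits them.
SELECT 2 AS query, days, SUM(waste), SUM(prepared),
       waste_type_id, restaurant_id
FROM flagged
GROUP BY days, waste_type_id, restaurant_id;
```

The food table is scanned and joined only once. Both branches read the already aggregated CTE, which is usually much smaller than the base table.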
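In my opinion, the database's job is to return data as fast as possible, and choosing the format in which the data is returned is part of that optimization. In other words, formatting is not the database's job but the application's. So, as suggested, I would move some of the logic into the application.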
