postgresql 按收入查找前3个产品类别,挑战是将其呈现为逗号分隔的字符串列表,按收入的降序排列

bpsygsoo  于 2023-03-29  发布在  PostgreSQL
关注(0)|答案(3)|浏览(101)

我有一个查询如下:

select date_trunc('day', ordertimestamp) as days 
, product
, revenue
from table

结果是:

days       |   product    | revenue

2022-07-10 |   shoes      |  $ 430
2022-07-10 |   chocolate  |  $ 10
2022-07-10 |   shirts     |  $ 3.25
2022-07-10 |   phones     |  $ 500
2022-01-02 |   vests      |  $ 200
2022-02-03 |   computers  |  $ 700

我需要以这样的方式放置此表,即按2022-07-10的收入排名前3位的产品类别出现在一行中,如下所示:

2022-07-10 |  phones, shoes, chocolate

正如你所看到的,它们是按收入降序排列的
我试过这样的方法:

SELECT
array(select products from dataset order by revenue desc limit 3) as top_three_product_categories_by_revenue
FROM table

但我不确定它是否准确

3okqufwl

3okqufwl1#

row_number()函数按组和顺序为每行分配数字。
rn<3,每个日期仅获得3个顶级产品。

with cte as (
  select days, product, row_number() over (partition by days order by revenue ASC) as rn
  from mytable
)
select days, string_agg(product, ', ' ORDER BY rn DESC) as top_products
from cte
where rn <= 3
group by days

Demo here

ryevplcw

ryevplcw2#

这段代码可以在DuckDB上运行,但也可以很容易地移植到其他数据库。

with cte_sequence as (
    select * from generate_series(1,100) as id
),
cte_base_data as (
    select
        id,
        timezone('utc', now()) - floor(random()*id*10000)::int) as order_datetime,
        'product' || floor(3*random())::int::text as product,
        random()*100 as revenue
    from
        cte_sequence
),
cte_top_products as (
    select
        date_trunc('day', order_datetime) as order_date,
        product,
        revenue,
        row_number() over(partition by order_date order by revenue desc) as row_id
    from
        cte_base_data
)
select
    order_date,
    string_agg(distinct product, ', ') as top_products_by_revenue
from
    cte_top_products
where
    row_id <= 3
group by 1
order by 1
w7t8yxp5

w7t8yxp53#

正如其他人在此期间发布的那样:
ROW_NUMBER(),按日期列划分ORDER BY revenue DESC,然后按结果行数小于4进行过滤,按日期和productSTRING_AGG()进行分组。

marco ~/1/Vertica/supp $ psql -af top3list.sql
WITH
indata(dt,product,revenue) AS (
          SELECT DATE '2022-07-10','shoes'    ,430
UNION ALL SELECT DATE '2022-07-10','chocolate',10
UNION ALL SELECT DATE '2022-07-10','shirts'   ,3.25
UNION ALL SELECT DATE '2022-07-10','phones'   ,500
UNION ALL SELECT DATE '2022-01-02','vests'    ,200
UNION ALL SELECT DATE '2022-02-03','computers',700
)
,
w_rn AS (
  SELECT
    *
  , ROW_NUMBER() OVER w AS rn
  FROM indata
  WINDOW w AS (PARTITION BY dt ORDER BY revenue DESC)
)
SELECT
  dt
, STRING_AGG(product,',' ORDER BY revenue DESC) AS top3
FROM w_rn
WHERE rn < 4
GROUP BY dt
;
     dt     |          top3          
------------+------------------------
 2022-01-02 | vests
 2022-02-03 | computers
 2022-07-10 | phones,shoes,chocolate

相关问题