PostgreSQL: find the number of orders needed to purchase n products in SQL

Posted by mzsu5hc0 on 2023-06-22 in PostgreSQL

I have a sample dataset:

order_num    order_date    email             product_id
    101        2023-06-01   xyz@gmail.com            1
    101        2023-06-01   xyz@gmail.com            2
    222        2023-06-02   xyz@gmail.com            1
    333        2023-06-03   xyz@gmail.com            3
    434        2023-06-05   xyz@gmail.com            3
    444        2023-06-01   abc@gmail.com            1
    444        2023-06-01   abc@gmail.com            2
    677        2023-06-02   abc@gmail.com            3

To buy product_id 1, 2, and 3, user xyz@gmail.com needs 3 orders (101, 222, 333), and user abc@gmail.com needs 2 orders (444, 677).
Expected result:

email               orders_to_cover_all_products
xyz@gmail.com                   3
abc@gmail.com                   2

How can I achieve this in SQL?
The query I tried:

select email, max(_row_number)
from (
    select *, row_number() over (partition by email, product_id order by order_num) _row_number
    from orders
) t
where _row_number <= 3
group by email;

Answer #1 (2hh7jdfx)

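Starting from each customer's first order, you want to count how many orders it took until all three products had been bought at least once.
One approach uses window functions to aggregate each customer's products and orders into running arrays. We can then keep only the rows whose product array contains every required product. Because window functions in Postgres do not support distinct, an extra step is needed to unnest the order array and count its distinct values: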

select o.email, min(x.cnt_orders) cnt_orders
from (
    select o.*,
        array_agg(product_id) over(partition by email order by order_date, order_num) product_ar,
        array_agg(order_num ) over(partition by email order by order_date, order_num) order_ar
    from orders o
) o
cross join lateral (
    select count(distinct x.order_num) as cnt_orders from unnest(o.order_ar) as x(order_num)
) x
where array[1, 2, 3] <@ product_ar
group by o.email

| email | cnt_orders |
| --- | --- |
| abc@gmail.com | 2 |
| xyz@gmail.com | 3 |
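
To sanity-check the result outside the database, the running-array idea above can be sketched procedurally. This is only an illustration under stated assumptions: the function name `orders_to_cover` and the tuple layout are hypothetical, and the rows are the sample data from the question.

```python
from itertools import groupby

# Sample rows from the question: (order_num, order_date, email, product_id)
ORDERS = [
    (101, "2023-06-01", "xyz@gmail.com", 1),
    (101, "2023-06-01", "xyz@gmail.com", 2),
    (222, "2023-06-02", "xyz@gmail.com", 1),
    (333, "2023-06-03", "xyz@gmail.com", 3),
    (434, "2023-06-05", "xyz@gmail.com", 3),
    (444, "2023-06-01", "abc@gmail.com", 1),
    (444, "2023-06-01", "abc@gmail.com", 2),
    (677, "2023-06-02", "abc@gmail.com", 3),
]


def orders_to_cover(rows, required):
    """Hypothetical helper: orders each customer needs to have bought all of `required`."""
    required = set(required)
    result = {}
    # Sort by customer, then chronologically, mirroring
    # "partition by email order by order_date, order_num".
    ordered = sorted(rows, key=lambda r: (r[2], r[1], r[0]))
    for email, customer_rows in groupby(ordered, key=lambda r: r[2]):
        seen_products = set()
        order_count = 0
        # Each inner group is one order: all rows sharing date and order number.
        for _, order_rows in groupby(customer_rows, key=lambda r: (r[1], r[0])):
            order_count += 1
            seen_products.update(r[3] for r in order_rows)
            if required <= seen_products:
                # First order at which every required product has been bought.
                result[email] = order_count
                break
    return result


print(orders_to_cover(ORDERS, {1, 2, 3}))
```

As in the query, customers who never buy every required product are simply absent from the result.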


Answer #2 (xpszyzbs)

One way to solve it:

with your_table (order_num, order_date, email, product_id) as (

    VALUES
        (101, '2023-06-01', 'xyz@gmail.com', 1),
        (101, '2023-06-01', 'xyz@gmail.com', 2),
        (222, '2023-06-02', 'xyz@gmail.com', 1),
        (333, '2023-06-03', 'xyz@gmail.com', 3),
        (434, '2023-06-05', 'xyz@gmail.com', 3),
        (444, '2023-06-01', 'abc@gmail.com', 1),
        (444, '2023-06-01', 'abc@gmail.com', 2),
        (677, '2023-06-02', 'abc@gmail.com', 3)
),
meta as (
select
    email,
    product_id,
    min(order_date) order_date
from
    your_table
group by
    email,
    product_id
),
maxMeta as (
select
    email,
    max(order_date) last_date_All
from
    meta
group by
    email

)

select
    o.email,
    count(distinct order_num) orders_to_cover_all_products
from
    your_table o
join maxMeta m on
    m.email = o.email
    and o.order_date <= last_date_all
group by
    o.email
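
The three steps of the query above (first purchase date per product, the latest of those dates per customer, then a distinct count of orders up to that date) can be traced in a short procedural sketch. The function name `orders_up_to_cover_date` and the tuple layout are hypothetical; the rows are the sample data from the question, and ISO-formatted date strings are assumed so that string comparison matches date order.

```python
# Sample rows from the question: (order_num, order_date, email, product_id)
ORDERS = [
    (101, "2023-06-01", "xyz@gmail.com", 1),
    (101, "2023-06-01", "xyz@gmail.com", 2),
    (222, "2023-06-02", "xyz@gmail.com", 1),
    (333, "2023-06-03", "xyz@gmail.com", 3),
    (434, "2023-06-05", "xyz@gmail.com", 3),
    (444, "2023-06-01", "abc@gmail.com", 1),
    (444, "2023-06-01", "abc@gmail.com", 2),
    (677, "2023-06-02", "abc@gmail.com", 3),
]


def orders_up_to_cover_date(rows):
    """Hypothetical helper mirroring the meta / maxMeta / final select steps."""
    # Step 1 ("meta"): earliest purchase date of each product per customer.
    first_bought = {}
    for order_num, order_date, email, product_id in rows:
        key = (email, product_id)
        if key not in first_bought or order_date < first_bought[key]:
            first_bought[key] = order_date

    # Step 2 ("maxMeta"): the latest of those first-purchase dates per customer.
    cutoff = {}
    for (email, _product_id), date in first_bought.items():
        cutoff[email] = max(cutoff.get(email, date), date)

    # Step 3: count distinct orders placed on or before the cutoff date.
    order_nums = {}
    for order_num, order_date, email, _product_id in rows:
        if order_date <= cutoff[email]:
            order_nums.setdefault(email, set()).add(order_num)
    return {email: len(nums) for email, nums in order_nums.items()}


print(orders_up_to_cover_date(ORDERS))
```

Two properties of this approach are worth keeping in mind: the cutoff is a date, so any other order placed on that same day is counted as well, and the set of products is whatever appears in the table rather than a fixed list such as 1, 2, 3. Neither case occurs in the sample data.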
