max()无法按预期工作

lndjwyie 于 2021-07-26 发布在 Java

关注(0)|答案(2)|浏览(343)

当我试图用一组记录来获得一个字段的最大值时，我遇到了一些问题，我希望你们中的一些人能帮助我找出我做错了什么。
我想在一个订单中找到最贵商品的id。
给出这个问题：

SELECT 
       orderHeader.orderKey, orderLines.lineKey, orderLines.itemKey, orderLines.OrderedQty, 
       orderLines.price, (orderLines.price*orderLines.OrderedQty) as LinePrice, 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY orderLines.lineKey asc) AS [ItemLineNum], 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY (orderLines.price*orderLines.OrderedQty) DESC) AS [LineMaxPriceNum],
       max(orderLines.itemKey) OVER (PARTITION BY orderHeader.orderKey ORDER BY (orderLines.price*orderLines.OrderedQty) DESC) as [MaxPriceItem]
FROM
       orderHeader inner join orderLines on orderHeader.orderKey=orderLines.orderKey

我得到这个结果：查询结果
抱歉，因为我不能直接在文章中插入图像，所以我将尝试使用代码段来格式化表格。
这些就是结果

| orderKey | lineKey | itemKey | OrderedQty | Price | LinePrice | ItemLineNum | LineMaxPriceNum | MaxPriceItem |
|----------|---------|---------|------------|-------|-----------|-------------|-----------------|--------------|
| 176141   | 367038  | 15346   | 3          | 1000  | 3000      | 2           | 1               | 15346        |
| 176141   | 367037  | 15159   | 2          | 840   | 1680      | 1           | 2               | 15346        |
| 176141   | 367039  | 15374   | 5          | 100   | 500       | 3           | 3               | 15374        |

如您所见，对于同一个“orderkey”，我有三行（linekey），每行都有不同的项目（itemkey）、不同的数量、不同的价格和不同的总成本（lineprice）。我想在maxpriceitem列中输入“lineprice”较高的项的键，但结果是错误的。三行应该显示15346是最贵的商品，但最后一行不对，我不明白为什么。另外，由同一表达式（linemaxpricenum）划分的行号给了我正确的顺序。
如果我在max内更改order by的表达式，如以下所示（ordering by“orderedqty”）：

SELECT 
       orderHeader.orderKey, orderLines.lineKey, orderLines.itemKey, orderLines.OrderedQty, 
       orderLines.price, (orderLines.price*orderLines.OrderedQty) as LinePrice, 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY orderLines.lineKey asc) AS [ItemLineNum], 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY (orderLines.price*orderLines.OrderedQty) DESC) AS [LineMaxPriceNum],
       max(orderLines.itemKey) OVER (PARTITION BY orderHeader.orderKey ORDER BY orderLines.OrderedQty DESC) as [MaxPriceItem]
FROM
       orderHeader inner join orderLines on orderHeader.orderKey=orderLines.orderKey

然后它就起作用了：

| orderKey | lineKey | itemKey | OrderedQty | Price | LinePrice | ItemLineNum | LineMaxPriceNum | MaxPriceItem |
|----------|---------|---------|------------|-------|-----------|-------------|-----------------|--------------|
| 176141   | 367038  | 15346   | 3          | 1000  | 3000      | 2           | 1               | 15374        |
| 176141   | 367037  | 15159   | 2          | 840   | 1680      | 1           | 2               | 15374        |
| 176141   | 367039  | 15374   | 5          | 100   | 500       | 3           | 3               | 15374        |

“orderedqty”最高的项目是15374，因此结果是正确的。
如果我再次更改order by在max中的表达式，如下所示（按“价格”排序）：

SELECT 
       orderHeader.orderKey, orderLines.lineKey, orderLines.itemKey, orderLines.OrderedQty, 
       orderLines.price, (orderLines.price*orderLines.OrderedQty) as LinePrice, 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY orderLines.lineKey asc) AS [ItemLineNum], 
       ROW_NUMBER() OVER(PARTITION BY orderHeader.orderKey ORDER BY (orderLines.price*orderLines.OrderedQty) DESC) AS [LineMaxPriceNum],
       max(orderLines.itemKey) OVER (PARTITION BY orderHeader.orderKey ORDER BY orderLines.price DESC) as [MaxPriceItem]
FROM
       orderHeader inner join orderLines on orderHeader.orderKey=orderLines.orderKey

与第一个例子相同，结果是错误的：

| orderKey | lineKey | itemKey | OrderedQty | Price | LinePrice | ItemLineNum | LineMaxPriceNum | MaxPriceItem |
|----------|---------|---------|------------|-------|-----------|-------------|-----------------|--------------|
| 176141   | 367038  | 15346   | 3          | 1000  | 3000      | 2           | 1               | 15346        |
| 176141   | 367037  | 15159   | 2          | 840   | 1680      | 1           | 2               | 15346        |
| 176141   | 367039  | 15374   | 5          | 100   | 500       | 3           | 3               | 15374        |

最高价格的商品是15346，但最后一个记录的最高价格没有显示这一点。
我错过了什么？为什么我会得到这些不同的结果？
很抱歉，如果格式没有正确完成，这是我在这里的第一个问题，我已经尽力了。
提前谢谢你能给我的任何帮助。

sql Join window-functions sql-server Max

来源：https://stackoverflow.com/questions/62173414/max-over-partition-by-not-working-as-intended

2条答案

按热度按时间

imzjd6km1#

我想在一个订单中找到最贵商品的id。
你误解了会议的目的 order by 窗口函数的子句；它是为了定义窗口框架，而不是比较值； max() 提供在窗口框架内作为参数提供的表达式的最大值。
另一方面，你想要 itemKey 最贵的订单行。我认为 first_value() 做你想做的事：

first_value(orderLines.itemKey) over(
    partition by orderHeader.orderKey 
    order by orderLines.price * orderLines.OrderedQty desc
) as [MaxPriceItem]

赞(0）回复(0）举报 2021-07-26

ldxq2e6h2#

公认的答案为原始问题提供了合理的替代解决方案，但并不能真正解释 max() 函数似乎工作不一致(和扰流警报，你实际上可以使用 max() 正如最初打算的那样，只是做了一个小小的调整。）
您必须了解聚合函数实际上是在分区内的窗口框架上操作的。默认情况下，框架是整个分区。所以聚合操作 max() 以及 sum() 在整个分区上运行，就像你假设的那样。此默认规范定义为 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING . 这就意味着不管我们的记录是什么， max() 一直向后看分区中的第一行，一直向前看分区中的最后一行，以便计算值。
但有一个阴险的陷阱：添加一个 ORDER BY 子句将默认框架规范更改为 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW . 这意味着无论我们的记录是什么， max() 一直向后看分区中的第一行，然后只向上看当前行，以便计算值。您可以在上一个示例中清楚地看到这一点（简化了一点）：

SELECT orderKey, itemKey, price,
   ROW_NUMBER() OVER(PARTITION BY orderKey ORDER BY price DESC) AS [PartitionRowNum],
   MAX(itemKey) OVER (PARTITION BY orderKey ORDER BY price DESC) as [MaxPriceItem]
FROM orders

结果/说明：

| orderKey | itemKey | Price | PartitionRowNum | MaxPriceItem | Commentary             |
|----------|---------|-------|-----------------|--------------|------------------------|
| 176141   | 15346   | 1000  | 1               | 15346        | Taking max of rows 1-1 |
| 176141   | 15159   | 840   | 2               | 15346        | Taking max of rows 1-2 |
| 176141   | 15374   | 100   | 3               | 15374        | Taking max of rows 1-3 |

解决方案

我们可以通过添加 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING 分区如下：

SELECT orderKey, itemKey, price,
   ROW_NUMBER() OVER(PARTITION BY orderKey ORDER BY price DESC) AS [PartitionRowNum],
   MAX(itemKey) OVER (PARTITION BY orderKey ORDER BY price DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as [MaxPriceItem]
FROM orders

结果/说明：

| orderKey | itemKey | Price | PartitionRowNum | MaxPriceItem | Commentary             |
|----------|---------|-------|-----------------|--------------|------------------------|
| 176141   | 15346   | 1000  | 1               | 15374        | Taking max of rows 1-3 |
| 176141   | 15159   | 840   | 2               | 15374        | Taking max of rows 1-3 |
| 176141   | 15374   | 100   | 3               | 15374        | Taking max of rows 1-3 |

赞(0）回复(0）举报 2021-07-26

我来回答

max()无法按预期工作

2条答案

解决方案

相关问题

热门标签

最新问答