SQL Server Different results for COUNT using as window function with and without ORDER BY

pbpqsu0x  于 2023-02-28  发布在  其他
关注(0)|答案(1)|浏览(142)

Say, I have this simple table:

declare @t table ([period] date, [contract] int, price int);
insert into @t values
('2022-11-01', 1, 234),
('2022-12-01', 1, 277),
('2023-01-01', 1, 489),
('2022-11-01', 2, 187),
('2022-12-01', 2, 598),
('2022-03-01', 2, 478);

When using it as window function with and without ORDER BY , I get different results. When I used ORDER BY , I was surprised with the results since I expected all values in count column to be 3 .

select     *,
       count = count(*) over (partition by [contract] order by [period])
from @t;
periodcontractpricecount
2022-11-0112341
2022-12-0112772
2023-01-0114893
2022-11-0124781
2022-12-0121872
2023-01-0125983

I couldn't understand, why this was happening. Then I decided to remove order by [period] clause. And then all worked like a charm!

select     *,
       count = count(*) over (partition by [contract])
from @t;
periodcontractpricecount
2022-11-0112343
2022-12-0112773
2023-01-0114893
2022-11-0124783
2022-12-0121873
2023-01-0125983

After that I decided to take a look at the documentation, and it says:
The order_by_clause determines the logical order of the operation. See OVER clause (Transact-SQL) for more information.

And thatOVER link says:

ROWS/RANGE that limits the rows within the partition by specifying start and end points within the partition. It requires ORDER BY argument and the default value is from the start of partition to the current element if the ORDER BY argument is specified.

Do I correctly understand that as soon as I specify ORDER BY , it acts like BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ? Then when I omit it, why it acts as BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ?

a11xaf1n

a11xaf1n1#

The documentation is quite clear on this further down the same page.
ORDER BY

ORDER BY *order_by_expression* [COLLATE *collation_name*] [ASC|DESC]

Defines the logical order of the rows within each partition of the result set. That is, it specifies the logical order in which the window function calculation is performed.

  • If it is not specified, the default order is ASC and window function will use all rows in partition.
  • If it is specified, and a ROWS / RANGE is not specified, then default RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame by the functions that can accept optional ROWS / RANGE specification (for example min or max).

So by default you get a running sum. This default is only if you specify ORDER BY , otherwise the window calculation is over the whole partition (ie not running).

Note that ROWS and RANGE are subtly different, and annoyingly 99% of the time you want ROWS for a running sum so make sure to specify it.

相关问题