Using SQL Server 2017, I encountered a query that was sorting on a VARCHAR column returning results in a different order than I expected, namely a string followed by a tab came before just the bare string. Further investigation showed that it was behaving as if the bare string had a space on the end.
Experimenting further, given the following:
SELECT * FROM (VALUES
(1, 'a' + CHAR(33)),
(2, 'a '),
(3, 'a '),
(4, 'a '),
(5, 'a' + CHAR(31)),
(6, 'a')
)v(rownum, text)
ORDER BY text
I would have expected the results to be:
rownum | text |
---|---|
6 | 'a' |
5 | 'a(31)' |
4 | 'a(sp)' |
3 | 'a(sp)(sp)' |
2 | 'a(sp)(sp)(sp)' |
1 | 'a(33)' |
But instead it was:
rownum | text |
---|---|
5 | 'a(31)' |
3 | 'a(sp)(sp)' |
4 | 'a(sp)' |
2 | 'a(sp)(sp)(sp)' |
6 | 'a' |
1 | 'a(33)' |
So it seems as if it treats 'a'+(no spaces) and 'a'+(any amount of spaces) as if they were all equivalent. What is going on here?
2条答案
按热度按时间u2nhd7ah1#
Because SQL Server follows ANSI standard string comparison rules, so 'a' and 'a ' are equal. eg
outputs
And the order in which rows are returned that are equal in the ORDER BY clause is undefined, and plan-dependent.
l0oc07j22#
So it seems as if it treats 'a'+(no spaces) and 'a'+(any amount of spaces) as if they were all equivalent.
This is exactly what is going on. When dealing with equality operations SQL Server effectively ignores trailing spaces, so
''
and' '
are seen as equal, and so too are'a'
,'a '
and'a '
.As a result the
ORDER BY
you have,ORDER BY text
, isn't deterministic; you have 4 rows that all have the same value. This means that when you run the query the 4 rows with the same value are returned in an arbitrary order. If you wanted to ensure you ordered values with the least trailing spaces first then you could addDATALENGTH(text)
to theORDER BY
, as unlikeLEN
it does include trailing spaces in its calculation:To get the order you were expecting, you'd want to order the data by it's binary value: