SQL Server Views | Inline View Expansion Guidelines

djmepvbi  于 2023-05-28  发布在  SQL Server
关注(0)|答案(1)|浏览(149)

Background

Hello all!

I recently learned that in newer versions of SQL Server, the query optimizer can "expand" a SQL view and utilize inline performance benefits. This could have some drastic effects going forward on what kinds of database objects I create and why and when I create them, depending upon when this enhanced performance is achieved and when it is not.

For instance, I would not bother creating a parameterized inline table-valued function with a start date parameter and an end date parameter for an extremely large transaction table (where performance matters greatly) when I can just make a view and slap a WHERE statement at the bottom of the calling query, something like

SELECT
     Column1
FROM vw_Simple
WHERE
     Column1 BETWEEN @SomeStartDate AND @SomeEndDate

and trust that the query optimizer will "expand" the view and give me a great execution plan.

Note: I am talking about a simple, non-nested, non-indexed SQL Server view. Something like

CREATE VIEW vw_Simple
AS
SELECT
    Column1
    ,Column2
FROM TableA

Question

My question is: what are the exact guidelines for knowing when the query optimizer can "expand" a SQL view and when it cannot?

I cannot find this answer in the official Microsoft documentation.

What I've found so far

Cases where the query optimizer can expand a view:

Cases where the query optimizer cannot expand a view:

Gray Area

  • This Stack Overflow post's second answer claims that the query optimizer may or may not expand the view, depending on the view's complexity and the limitations of the query optimizer.
9w11ddsr

9w11ddsr1#

You will not find this information in the documentation, because it is not a single feature per se, it is simply the compiler/optimizer working its way through the query in various phases, using a number of different techniques to get the best execution plan. Sometimes it can safely push through predicates, sometimes it can't.

Note that "expanding the view" is the wrong term here. The view is always expanded into its definition ( NOEXPAND excepted). What you are referring to is called predicate pushdown.

What happens to a view during compilation?

I've assumed here that indexed views and NOEXPAND are not being used. Indexed views are a whole other ball-game .

When you execute a query, the compiler starts by parsing and lexing the query into a basic execution plan. This is a very rough, unoptimized version which pretty much mirrors the query as written.

When there is a view in the query, the compiler retrieves the view's pre-parsed execution tree and shoves it into the execution plan, again it is a very rough draft.

With derived tables, CTEs, correlated and non-correlated subqueries, as well as inline TVFs, the same thing happens, except that parsing is needed also.

After this point, you can assume that a view may as well have been written as a CTE, it makes no difference.

Can the optimizer push through the view?

The compiler has a number of tricks up its sleeve, and predicate pushdown is one of them, as is simplifying views.

The ability of the compiler here is mainly dependent on whether it can deduce that a simplification is permitted, not that it is possible.

For example, this query

SELECT SomeCol
FROM (
    SELECT TOP 100 PERCENT *
    FROM (
        SELECT SomeCol, OtherCol, 1 / 0 AS ThisDoesntError
        FROM table1
    ) t
    WHERE OtherCol = 1
    ORDER BY ThisDoesntError
) t
WHERE OtherCol <> 2

is fairly trivial to optimize away to this

SELECT SomeCol
FROM table1
WHERE OtherCol = 1

because TOP 100 PERCENT... ORDER BY... is known to be something that will make no difference to the outer query, and can therefore be dropped, followed by the whole ThisDoesntError column which is now not being used anywhere. The two predicates on OtherCol can be combined into one. The above query therefore does not generate a divide-by-zero error.

So when does it not work?

The problems start when the optimizer cannot push through the view, because it may change the semantics (and therefore the results) of the query.

SELECT SomeCol
FROM (
    SELECT TOP 10 *
    FROM (
        SELECT SomeCol, OtherCol, 1 / 0 AS ThisDOESError
        FROM table1
    ) t
    ORDER BY ThisDOESError
) t
WHERE OtherCol = 1

Because the TOP needs to be calculated based on the ORDER BY ThisDOESError clause, the column ThisDOESError cannot be elided, and the filter on OtherCol cannot be pushed through.

Likewise this one cannot be optimized either

SELECT SomeCol
FROM (
    SELECT SomeCol, OtherCol,
        ROW_NUMBER() OVER (PARTITION BY SomeCol ORDER BY ThirdCol) AS rn
    FROM table1
) t
WHERE rn = 1 AND OtherCol = 1

In this case, because the row-number must be calculated on the whole set, the filter OtherCol = 1 cannot be safely pushed through.

Interestingly enough, this version should be able to be safely pushed through (no promises though!)

SELECT SomeCol
FROM (
    SELECT SomeCol, OtherCol,
        ROW_NUMBER() OVER (PARTITION BY SomeCol ORDER BY ThirdCol) AS rn
    FROM table1
) t
WHERE rn = 1 AND SomeCol = 'Something'

In this case, the optimizer should in theory be able to see that the filtering column is also the partitioning column, and therefore the row-number calculation will not change. This has been blogged about here .

These are just some examples of possible optimizations, there are many more. Not every possible optimization that you can think of has been implemented, so sometimes you need to help the compiler along. This can often be done by changing your view into an Inline Table Valued Function, where you can push parameters all the way to where you want them to be.

相关问题