SQL Server How to handle calculations done inside a join

93ze6v8z  于 2023-06-28  发布在  其他
关注(0)|答案(1)|浏览(176)

I have a join statement that looks like so:

... FROM
        table1 act
    JOIN
        table2 ti
    ON
        act.CD_AGENT = ti.CD_AGENT --
        AND 
        ti.DT_INTERVAL_START >= DATEADD(minute, (DATEDIFF(minute, 0, (act.[DT_START_TIME])) / @increment) * @increment, 0)
        AND
        ti.DT_INTERVAL_START < DATEADD(minute, IIF(ISNULL(act.[DT_END_TIME], @enddate_nullreplacement) > DATEADD(minute, CEILING(DATEDIFF(minute, 0, ISNULL(act.[DT_END_TIME], @enddate_nullreplacement)) / CAST(@increment AS float)) * @increment, 0), @increment, 0), DATEADD(minute, CEILING(DATEDIFF(minute, 0, ISNULL(act.[DT_END_TIME], @enddate_nullreplacement)) / CAST(@increment AS float)) * @increment, 0))
    WHERE ...

I suspect that those calculations inside the join have a negative impact on my performance. Do you see an elegant way to improve it? I considered having the calculations before in a temporary table or maybe add the results in an additional column. What do you think?

7gcisfzg

7gcisfzg1#

Here are some suggestions for simplifying your query. I don't think they will significantly change performance, but can make the code more readable. A few opinions on the expected execution plan follow.

If you find yourself dealing with complicated expressions that hinder readability, it is sometimes advantageous to break out the calculations into a CROSS APPLY . One of the uses for a cross apply is to encapsulate intermediate calculations whose results can be utilized later in the join sequence.

Something like:

... FROM
        table1 act
    CROSS APPLY (
        select
            longExpression AS FromDate,
            anotherLongExpression AS ToDate
    ) calc
    JOIN
        table2 ti
    ON
        act.CD_AGENT = ti.CD_AGENT
        AND 
        ti.DT_INTERVAL_START >= calc.FromDate
        AND
        ti.DT_INTERVAL_START < Calc.ToDate
    WHERE ....

Multiple CROSS APPLY s can be used in sequence if you wish you build up your calculations, particularly if you have repeated sub-expressions.

... FROM
        table1 act
    CROSS APPLY (
        SELECT
            ISNULL(act.[DT_END_TIME], @enddate_nullreplacement)
                AS EndDate1
    ) c1
    CROSS APPLY (
        SELECT
            DATEADD(minute, CEILING(DATEDIFF(minute, 0, c1.EndDate1) / CAST(@increment AS float)) * @increment, 0)
                AS EndDate2
    ) c2
    CROSS APPLY (
        select
            DATEADD(minute, (DATEDIFF(minute, 0, (act.[DT_START_TIME])) / @increment) * @increment, 0)
                AS FromDate,
            DATEADD(minute, IIF(c1.EndDate1 > c2.EndDate2, @increment, 0), c2.EndDate2)
                AS ToDate
    ) c3
    JOIN
        table2 ti
    ON
        act.CD_AGENT = ti.CD_AGENT
        AND 
        ti.DT_INTERVAL_START >= c3.FromDate
        AND
        ti.DT_INTERVAL_START < c3.ToDate
    WHERE ....

I believe the above is equivalent to your original query, but you should double check the work.

There might also be a case for breaking some of your date-rounding calculations out into a couple of user-defined functions. Something like:

CREATE FUNCTION DateTimeRoundMinutes(@dt DATETIME, @increment INT, @roundUp BIT)
RETURNS DATETIME
AS ...

As for query execution efficiency, I believe the most likely execution plan would:

  1. Select rows from table1 (preferably via an index that efficiently filters for conditions in the WHERE clause)
  2. Calculate a few values
  3. Use those values to perform a loop join into the Table2 (preferably via an index on Table2(CD_AGENT, DT_INTERVAL_START) ).

Both of the above table accesses would likely be executed as an index seek followed by a primary key lookup. Assuming that the necessary indexes are in place, I think the performance should be pretty good.

Even without restructuring your original query, I expect the execution plan would be similar. I suggest that you try both the original and restructured queries and review the execution plans. If comparable, go with what you judge to be more readable and maintainable.

(The above assumes an execution plan that accesses Table1 first and Table2 second. Given the join condition calculations, I believe there is little chance that the SQL Server could generating an efficient execution plan that accessed Table2 first.)

相关问题