sql server存储过程在进行小更改后需要很长时间才能运行

我有一个存储过程，它从我创建的两个表中获取信息，生成一个摘要表，然后与几个视图一起使用。
以前，这需要60-90秒来运行。我有两个调用不同成本的函数，第三个调用另一个调用cost*qty。我删除了所有3个函数，并替换为一个新函数，它几乎是其他代价函数的精确副本
我写这个是因为我正在努力，所以它有点演变。我在速度上有所提高，但仍然没有以前快，我不知道为什么。

ALTER FUNCTION [dbo].[fn_getFactoryStdCost]
    (@PartID int)
RETURNS decimal(20, 4)
AS
BEGIN
    DECLARE @pureID int = 0

    SET @pureID = (SELECT TOP(1) PURE_COST_ID 
                   FROM visuser.PART_COST 
                   WHERE EN_PART_ID = @partID 
                   ORDER BY EN_REV_MASTER_ID DESC, IC_WAREHOUSE_ID DESC)

    RETURN (SELECT TOP(1) (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N) 
            FROM visuser.PURE_COST 
            WHERE PURE_COST_ID = @pureID 
            ORDER BY (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N) DESC) 
END

替换为。我添加了 WITH INLINE = OFF 在它第一次被困在排除这个可能性之后。这个函数本身工作得很好。

ALTER FUNCTION [dbo].[fn_getFactoryStdCost] 
    (@PartID int)
RETURNS decimal(20,4)
WITH INLINE = OFF
AS
BEGIN
    DECLARE @pureID int = 0

    SET @pureID = (SELECT TOP(1) PURE_COST_ID 
                   FROM visuser.PART_COST 
                   WHERE EN_PART_ID = @partID 
                   ORDER BY EN_REV_MASTER_ID DESC, IC_WAREHOUSE_ID DESC)

    RETURN (SELECT TOP(1) (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N + TOT_RUN_VALUE_N + TOT_FIXED_OVERHEAD_N) FROM visuser.PURE_COST WHERE PURE_COST_ID = @pureID ORDER BY (TOT_MATERIAL_N + TOT_MATERIAL_OVERHEAD_N) DESC) 
END

我做的其他改变是 [Qty] > 0 AND 到[part count]行，并将commodity id的基于字符串的条目替换为int（这更合适），因为commodity\u id是对commodity\u代码的引用，而commodity\u代码就是字符串。
我希望它跑得更快，而不是无限期地跑。这个过程现在需要很长时间才能运行。我现在已经38分钟了。我还试着在过程本身中复制代码并运行它，这也会花费很长时间，所以它是代码本身的一部分。
allpartslist表有1.04m行，bombreakdown表也有。bombreakdown表要复杂得多，需要40-60秒才能生成。bomsummary表将有4100行。allpartslist表有适当的索引，bombreakdown没有。

ALTER PROCEDURE [dbo].[createBOMSummary]
AS

    DECLARE @processTime int=0, @begin datetime, @end datetime

    SET @begin = SYSDATETIME()

    IF OBJECT_ID(N'dbo.bomSummary', N'U') IS NOT NULL
        DROP TABLE bomSummary

    SELECT 
        DISTINCT ap.[SourcePartID] AS [Assembly Part ID],
        p.[PART_X] AS [Assembly Part #],
        p.[DESCR_X] AS [Assembly Part Description],

        (SELECT COUNT(DISTINCT [Component Part #]) FROM [bomBreakdown] WHERE [Qty] > 0 AND [Component Part ID] IS NOT NULL AND SourcePartID = ap.SourcePartID GROUP BY [SourcePartID]) AS [Part Count],
        (SELECT SUM([Qty]) FROM [bomBreakdown] WHERE [Component Part ID] IS NOT NULL AND SourcePartID = ap.[SourcePartID] GROUP BY [SourcePartID]) AS [Total # of Parts],
        ([dbo].[fn_getFactoryStdCost](ap.[SourcePartID])) AS [Factory Std Cost],

        COALESCE(
            (SELECT COUNT(DISTINCT ComponentPartID) 
              FROM AllPartsList apl
                LEFT JOIN visuser.EN_PART p1
                  ON p1.[EN_Part_ID] = apl.[ComponentPartID]
              WHERE 
                apl.ComponentPartID IS NOT NULL AND 
                apl.SourcePartID = ap.SourcePartID  AND
                p1.Commodity_ID IN (15, 84, 85, 87, 81, 92) -- Commodity Codes: 009, 072, 073, 075, 079, 082
              GROUP BY SourcePartID
            ), 0) AS [# of Docs], --0sec

        COALESCE(
        (SELECT COUNT(DISTINCT ComponentPartID) 
        FROM AllPartsList apl
            LEFT JOIN visuser.EN_PART p1
                ON p1.[EN_Part_ID] = apl.[ComponentPartID]
        WHERE 
            apl.ComponentPartID IS NOT NULL AND 
            apl.SourcePartID = ap.SourcePartID  AND
            p1.Commodity_ID IN (28)  -- Commodity Code 034
        GROUP BY SourcePartID
        ), 0) AS [# of Software], --0sec

        COALESCE(
        (SELECT COUNT(*) 
        FROM visuser.[PART_COST] 
        WHERE [STD_PO_Cost_N] > 0 AND 
            EN_PART_ID IN 
            (SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
        ), 0) AS [# of Std Cost Items], --0sec

        COALESCE(
        (SELECT COUNT(DISTINCT ComponentPartID) 
        FROM AllPartsList apl
            LEFT JOIN visuser.EN_PART p1
                ON p1.[EN_Part_ID] = apl.[ComponentPartID]
        WHERE 
            apl.ComponentPartID IS NOT NULL AND 
            apl.SourcePartID = ap.SourcePartID  AND
            p1.Commodity_ID IN (11)  -- Commodity Code: 002
        GROUP BY SourcePartID), 0
        ) AS [# of HR Devices] ,--0sec

        COALESCE(
        (SELECT COUNT(DISTINCT ComponentPartID) 
        FROM AllPartsList apl
            LEFT JOIN visuser.EN_PART p1
                ON p1.[EN_Part_ID] = apl.[ComponentPartID]
        WHERE 
            apl.ComponentPartID IS NOT NULL AND 
            apl.SourcePartID = ap.SourcePartID  AND
            p1.Commodity_ID IN (5)  -- Commodity Code: 007
        GROUP BY SourcePartID), 0
        ) AS [# of 3rd Party Devices], --0sec

        COALESCE(
        (SELECT COUNT(DISTINCT ComponentPartID) 
        FROM AllPartsList apl
            LEFT JOIN visuser.EN_PART p1
                ON p1.[EN_Part_ID] = apl.[ComponentPartID]
        WHERE 
            apl.ComponentPartID IS NOT NULL AND 
            apl.SourcePartID = ap.SourcePartID  AND
            p1.Commodity_ID IN (13) AND  -- Commodity Code: 005
            p1.MAKE_BUY_C = 'B'
        GROUP BY SourcePartID
        ), 0) AS [# of Robots], --0sec

        COALESCE(
        (SELECT COUNT(*) 
        FROM visuser.[PART_COST] c
            LEFT JOIN visuser.[EN_PART] p
            ON p.[EN_PART_ID] = c.[EN_PART_ID]
        WHERE 
            c.[STD_PO_Cost_N] > 0 AND 
            p.[MAKE_BUY_C] = 'B' AND
            c.[EN_PART_ID] IN 
               (SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
        ), 0) AS [# of Buy Parts], --0sec

        COALESCE(
        (SELECT COUNT(*) 
        FROM visuser.[PART_COST] c
            LEFT JOIN visuser.[EN_PART] p
            ON p.[EN_PART_ID] = c.[EN_PART_ID]
        WHERE 
            c.[STD_PO_Cost_N] > 0 AND 
            p.[MAKE_BUY_C] = 'M' AND
            c.[EN_PART_ID] IN 
            (SELECT DISTINCT ComponentPartID FROM AllPartsList WHERE ComponentPartID IS NOT NULL AND SourcePartID = ap.SourcePartID)
        ), 0) AS [# of Make Parts]  

    INTO bomSummary
    FROM AllPartsList ap
      LEFT JOIN visuser.EN_PART p
        ON p.[EN_Part_ID] = ap.[SourcePartID]
    ORDER BY [PART_X]

    SET @end = SYSDATETIME()
    SET @processTime = DATEDIFF(s, @begin, @end)

    PRINT @end
    PRINT CHAR(10)+CHAR(13)
    PRINT 'bomSummary Processing Time: ' + CONVERT(varchar, @processTime)

GO

以下是bombreakdown表的外观：

以及allpartslist表：

如果我注解掉函数行，两条记录需要1米20秒来处理，下面是执行计划的一部分。看起来每次合并都会增加4-6秒的处理时间。

如果我删除所有的合并，那么需要2分50秒来处理所有4981条记录。下面是它的执行列表：

执行计划建议了几个额外的索引，所以我添加了这些索引，现在1条记录需要0秒，2条记录需要5秒，10条记录需要1秒，100条记录需要2秒，1000条记录需要28秒，所有4981条记录需要4分17秒。额外的索引当然有帮助，我不再看到%s超过1000%，有几个仍然超过100%，这使我认为有一些更多的优化，可以做的，我只是不知道在哪里。执行计划是巨大的，所以这里只有几张照片：

不知道这两张唱片是怎么回事。这已经不是以前的90秒了，但至少现在结束了。
奇怪的是，它有（1000行受影响），然后（1行受影响）。我不知道那一排是什么，也不知道它从哪里来。我还是很想知道为什么做了这么少的改变会有这么大的不同。
我正在使用：
sql server 2019（v15.0.2070.41）
ssms版本18.5
以下是我根据allmhuran的建议修改的结果：

SELECT
    DISTINCT ap.[SourcePartID] AS [Assembly Part ID],
    p.[PART_X] AS [Assembly Part #],
    p.[DESCR_X] AS [Assembly Part Description],
    oa2.[Part Count],
    oa2.[Total # of Parts],
    ([dbo].[fn_getFactoryStdCost](ap.[SourcePartID])) AS [Factory Std Cost],
    oa2.[# of Docs],
    oa2.[# of Software],
    'Logic Pending' AS [# of Std Cost Items],
    oa2.[# of HR Devices],
    oa2.[# of 3rd Party Devices],
    oa2.[# of Robots],
    oa2.[# of Buy Parts],
    oa2.[# of Make Parts]

  FROM AllPartsList ap
    LEFT JOIN visuser.EN_PART p
      ON p.[EN_Part_ID] = ap.[SourcePartID]
  OUTER APPLY (
        SELECT
            [Part Count]                = COUNT(    DISTINCT IIF( [Qty] = 0, null, [Component Part #])  ),  
            [Total # of Parts]          = SUM([Qty]),
            [# of Docs]                 = COUNT(    DISTINCT IIF( [Commodity Code] IN ('009', '072', '073', '075', '079', '082'), [Component Part #], null) ), -- Commodity Codes: 009, 072, 073, 075, 079, 082  :  Commodity ID: 15, 84, 85, 87, 81, 92
            [# of Software]             = COUNT(    DISTINCT IIF( [Commodity Code] IN ('034'), [Component Part #], null)    ), -- Commodity Code 034  :  Commodity ID: 28
            [# of HR Devices]           = COUNT(    DISTINCT IIF( [Commodity Code] IN ('002'), [Component Part #], null)    ), -- Commodity Code 002  :  Commodity ID: 11
            [# of 3rd Party Devices]    = COUNT(    DISTINCT IIF( [Commodity Code] IN ('007'), [Component Part #], null)    ), -- Commodity Code 007  :  Commodity ID: 5
            [# of Robots]               = COUNT(    DISTINCT IIF( ( [Commodity Code] IN ('005') AND [Make/Buy] = 'B' ), [Component Part #], null)   ), -- Commodity Code 005  :  Commodity ID: 13
            [# of Buy Parts]            = COUNT(    DISTINCT IIF( [Make/Buy] = 'B', [Component Part #], null)   ),
            [# of Make Parts]           = COUNT(    DISTINCT IIF( [Make/Buy] = 'M', [Component Part #], null)   )

          FROM bomBreakdown
          WHERE
            [Component Part ID] IS NOT NULL AND 
            [SourcePartID] = ap.[SourcePartID] AND
            --[SourcePartID] = ap.[AssemblyPartID] AND
            ap.SourcePartID = 964
          GROUP BY [SourcePartID]
    ) oa2

好吧，抽点时间来看看。
标量函数重构
正如我在评论中提到的，标量函数对基于集合的操作不好。一般来说，如果你有

create function scalar_UDF(@i int) returns int as begin
   return @i * 2;
end

select    c = scalar_UDF(t.c)
from      t;

然后，这将通过覆盖下的痛苦行（rbar）操作将select变成一行。
您可以通过坚持使用基于集合的操作来提高性能。一种方法是将标量udf标记为 inline ，它基本上告诉sql它可以在生成查询计划之前将您的查询重写为：

select    c = t.c * 2
from      t;

但是标量函数内联对于微软来说是一件很难解决的事情，而且仍然有一些缺陷。另一种方法是使用内联表值函数和 cross apply 或者 outer apply ```
create function inline_TVF(@i int) returns table as return
(
select result = @i * 2
)

select c = u.result
from t
outer apply inline_TVF(t.c) u;

实际因子分解重构
现有程序的一部分如下所示：

select [Part Count] =
(
select count(distinct [Component Part #])
from bomBreakdown
where Qty > 0
and [Component Part ID] is not null
and SourcePartID = ap.SourcePartID
group by SourcePartID
),
[Total # of Parts] =
(
select sum(Qty)
from bomBreakdown
where [Component Part ID] is not null
and SourcePartID = ap.SourcePartID
group by SourcePartID
)
-- , more ...

这两个子查询看起来非常相似。这是一种模式：

select a = (
select x1 from y where z
),
b = (
select x2 from y where almost_z
)

我们真正想做的是如下的事情。如果可以的话，那么查询只需要点击 `y` 一次，而不是两次。当然，语法是无效的：

select a = t.x1,
b = t.x2
from (
select x1 where z,
x2 where almost_z
from y
) t

啊哈，但也许我们可以聪明一点。如果我们回顾一下你的具体案例，我们可能会把它变成这样：

select oa1.[Part Count],
oa1.[Total # of Parts]
into bomSummary
from AllPartsList ap
left join visuser.EN_PART p on p.EN_Part_ID = ap.SourcePartID
outer apply (
select [Part Count] = count
(
distinct iif
(
Qty = 0, null, [Component Part #]
)
),
[Total # of Parts] = sum(qty)
from bomBreakdown
where [Component Part ID] is not null
and SourcePartID = ap.SourcePartID
group by SourcePartID
)
oa1

这里，那个 `iif(Qty = 0, null, [Component Part #])` 如果数量为零，将使列为空。count将忽略这些空值。我们得到了清晰的，就像以前一样。所以我们偷偷地找到了一个 `where` 这里的子句是：“数一数 `component part #` 数量不等于零的值”。现在我们可以求和了 `Qty` 列，我们已经完成了重构。
同样的重构可以在这个存储过程的许多地方进行。这实际上是重构sql的一个很好的学习练习。我不打算做所有这些，只是试着识别模式，并遵循一个分解过程-就像你在代数中做的那样。因为，在很多方面，这就是代数！
请原谅任何拼写错误/语法错误。我还不能通过一个实际的查询窗口来检查这一点，我在这里的目的是演示一些想法，而不是实际重写原始查询。

sql server存储过程在进行小更改后需要很长时间才能运行

1条答案

相关问题

热门标签

最新问答