MySQL query performance: generating a report for a given date, with counts and cumulative counts, from a huge table

pn9klfpd  posted on 2024-01-05  in MySQL

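I am using MySQL and have a detail table that holds 2 million records on my local machine. The following query takes about 2 minutes to generate its output from those 2 million records, but in production that table receives 800,000 records on any given day.
I already have indexes on the columns StatusDateTime, ProductionFacility, and ProductionStatusNo.
I have even partitioned the table on the value of ProductionStatusNo (0, 1, 2, and so on up to 12).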

CREATE DEFINER=`root`@`localhost` PROCEDURE `a_dashboard_count`(
IN p_StatusDate Date, 
IN p_UnassignedProductionFaciltiy nvarchar(500)
)
BEGIN

     DECLARE PD_S_C INT;
     SELECT COALESCE(cumulative_count, 0) INTO PD_S_C  FROM (   
     SELECT 0 AS dummy_value 
        ) dummy
LEFT JOIN dailysummary ON  ProductionStatusNo = 1     AND StatusDateTime = DATE_SUB(p_StatusDate, INTERVAL 1 DAY);
    

INSERT INTO dailysummary(ProductionStatus ,ProductionStatusNo,StatusDateTime ,count , cumulative_count ) 

SELECT 'Unassigned' AS ProductionStatus,
        0 AS ProductionStatusNo, 
        p_StatusDate AS StatusDate,
       COUNT(DISTINCT d.UniqueFormId) as DayCount,
       COUNT(DISTINCT d.UniqueFormId) as CumulativeCount
FROM detail d
LEFT JOIN (
    SELECT DISTINCT UniqueFormId
    FROM detail exd
    WHERE exd.StatusDate <= p_StatusDate AND exd.ProductionStatusNo != 1
) exd ON d.UniqueFormId = exd.UniqueFormId
WHERE d.ProductionFacility = p_UnassignedProductionFaciltiy
  AND d.StatusDate <= p_StatusDate
  AND exd.UniqueFormId IS NULL
  
union all

 SELECT
    ps.Status,
    ps.Id,
    p_StatusDate,
    COALESCE(totalcount, 0) AS count,
    COALESCE(totalcount, 0) + PD_S_C AS cumulative_count
FROM
    productionstatus AS ps
LEFT JOIN (
    SELECT
        COUNT(DISTINCT d.UniqueFormId) AS totalcount,
        p_StatusDate AS StatusDate,
        d.ProductionStatusNo
    FROM
        detail AS d
    LEFT JOIN
        detail AS exd ON d.UniqueFormId = exd.UniqueFormId
                       AND exd.StatusDate < p_StatusDate
                       AND exd.ProductionStatusNo = d.ProductionStatusNo
    WHERE
        d.StatusDate = p_StatusDate
        AND exd.UniqueFormId IS NULL
    GROUP BY
        d.ProductionStatusNo
) AS d ON ps.Id = d.ProductionStatusNo
WHERE
    ps.Id = 1


        UNION all

        SELECT     ps.Status AS ProductionStatus,    
        ps.Id AS ProductionStatusNo,  
        p_StatusDate AS StatusDate,   
        COALESCE(c, 0)  AS  DayCount,  
        COALESCE(c, 0) AS CumulativeCount 
        FROM  productionstatus as ps
        LEFT JOIN (    SELECT         COUNT(*) as c,        ed.psn 
        FROM (
        SELECT
            UniqueFormId,MAX(productionStatusNo) as psn  FROM   detail
        WHERE   StatusDate <= p_StatusDate
        GROUP BY UniqueFormId
            ) as ed
        GROUP BY ed.psn
        ) as l ON ps.Id = l.psn
        WHERE
            ps.Id not in ( 0,1);
END

Here is the table schema, including its indexes:

CREATE TABLE `detail` (
  `Id` int NOT NULL AUTO_INCREMENT,
  
 
  `EINNo` varchar(45) NOT NULL,
  `EmployeeNo` varchar(45) NOT NULL,
  `Form` varchar(45) NOT NULL,
  
  `ProductionStatusNo` int NOT NULL,
  
  
  `UniqueFormId` varchar(450) NOT NULL,
  `ProductionFacility` varchar(450) NOT NULL,
  
  `StatusDate` date DEFAULT NULL,
  PRIMARY KEY (`Id`,`ProductionStatusNo`),
  KEY `idx_detail_EINNo` (`EINNo`),
  KEY `idx_detail_EmployeeNo` (`EmployeeNo`),
  KEY `idx_detail_Form` (`Form`),
  KEY `idx_detail_UniqueFormId` (`UniqueFormId`),
  KEY `idx_detail_ProductionFacility` (`ProductionFacility`),
  KEY `idx_detail_ProductionStatusNo` (`ProductionStatusNo`),
  KEY `idx_detail_StatusDate` (`StatusDate`)
) ENGINE=InnoDB AUTO_INCREMENT=11652798 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50100 PARTITION BY RANGE (`ProductionStatusNo`)
(PARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (2) ENGINE = InnoDB,
 PARTITION p2 VALUES LESS THAN (3) ENGINE = InnoDB,
 PARTITION p3 VALUES LESS THAN (4) ENGINE = InnoDB,
 PARTITION p4 VALUES LESS THAN (5) ENGINE = InnoDB,
 PARTITION p5 VALUES LESS THAN (6) ENGINE = InnoDB,
 PARTITION p6 VALUES LESS THAN (7) ENGINE = InnoDB,
 PARTITION p7 VALUES LESS THAN (8) ENGINE = InnoDB,
 PARTITION p8 VALUES LESS THAN (9) ENGINE = InnoDB,
 PARTITION p9 VALUES LESS THAN (10) ENGINE = InnoDB,
 PARTITION p10 VALUES LESS THAN (11) ENGINE = InnoDB,
 PARTITION p11 VALUES LESS THAN (12) ENGINE = InnoDB,
 PARTITION p12 VALUES LESS THAN (13) ENGINE = InnoDB,
 PARTITION p13 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;


ymzxtsji1#

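  • Build and maintain a Summary Table containing daily subtotals, rather than plowing through the Fact table for every report.

Do you need to include the current day's counts? (There are two ways to handle that.)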
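
As a rough illustration of the summary-table idea, the sketch below keeps one row per day and status, and answers the report from that small table. The table name `daily_status_summary`, its columns, and the variable `@report_date` are hypothetical. It also does not reproduce the exact business rules of the procedure in the question: summing daily distinct counts only equals a cumulative distinct count if a form never reaches the same status on two different days.

```sql
-- Hypothetical summary table: one row per (day, status).
CREATE TABLE IF NOT EXISTS daily_status_summary (
  StatusDate         DATE NOT NULL,
  ProductionStatusNo INT  NOT NULL,
  day_count          INT  NOT NULL,
  PRIMARY KEY (StatusDate, ProductionStatusNo)
) ENGINE=InnoDB;

SET @report_date = '2024-01-05';  -- example value

-- Load (or reload) the subtotals for one day.
-- Only that day's rows in `detail` are scanned.
INSERT INTO daily_status_summary (StatusDate, ProductionStatusNo, day_count)
SELECT s.StatusDate, s.ProductionStatusNo, s.day_count
FROM (
  SELECT d.StatusDate,
         d.ProductionStatusNo,
         COUNT(DISTINCT d.UniqueFormId) AS day_count
  FROM detail AS d
  WHERE d.StatusDate = @report_date
  GROUP BY d.StatusDate, d.ProductionStatusNo
) AS s
ON DUPLICATE KEY UPDATE day_count = s.day_count;

-- The report then reads only the summary rows, not the fact table.
SELECT s.ProductionStatusNo,
       SUM(CASE WHEN s.StatusDate = @report_date
                THEN s.day_count ELSE 0 END) AS day_count,
       SUM(s.day_count)                      AS cumulative_count
FROM daily_status_summary AS s
WHERE s.StatusDate <= @report_date
GROUP BY s.ProductionStatusNo;
```

Reloading a day is idempotent here because of the primary key plus `ON DUPLICATE KEY UPDATE`, which is one possible way to handle a current day that is still receiving rows.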

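  • Partitioning is probably useless unless you intend to delete "old" data. Even then, it only helps with big deletes (by using DROP PARTITION instead of a large DELETE).
  • Redundant (the first index is a left prefix of the second):
KEY `idx_ProductionStatus` (`ProductionStatus`),  -- DROP this
KEY `IX_detail_Covering`   (`ProductionStatus`,`StatusDateTime`,`Id`),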


  • Why select this twice?
COUNT(DISTINCT UniqueFormId )

  • Please qualify each column with the table it comes from. I had trouble interpreting this:
WHERE  DATE(statusdatetime) <= p_StatusDateTime


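Also, DATE(col) is not sargable: once the column is wrapped in a function, an index on it cannot be used for the comparison. Perhaps we can fix that.
Without table aliases on the columns, I can't analyze the indexes.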
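
One common way to make such a predicate sargable is to leave the column bare and move the date arithmetic to the other side of the comparison. The sketch below assumes a DATETIME column and a DATE parameter. The table `status_demo`, its index, and `@p_StatusDate` are hypothetical names used only for illustration.

```sql
-- Hypothetical demo table with an indexed DATETIME column.
CREATE TABLE IF NOT EXISTS status_demo (
  Id             INT      NOT NULL AUTO_INCREMENT PRIMARY KEY,
  StatusDateTime DATETIME NOT NULL,
  KEY idx_status_demo_dt (StatusDateTime)
) ENGINE=InnoDB;

SET @p_StatusDate = '2024-01-05';  -- example value

-- Not sargable: DATE() must be evaluated for every row,
-- so the index cannot be used to bound the range.
EXPLAIN
SELECT COUNT(*)
FROM status_demo AS s
WHERE DATE(s.StatusDateTime) <= @p_StatusDate;

-- Sargable rewrite with the same meaning: everything up to
-- (but not including) midnight at the start of the next day.
EXPLAIN
SELECT COUNT(*)
FROM status_demo AS s
WHERE s.StatusDateTime < DATE_ADD(@p_StatusDate, INTERVAL 1 DAY);
```

Comparing the two EXPLAIN outputs on realistic data should show whether the optimizer treats the second form as a range condition on the index. If the column is already a DATE, as `StatusDate` is in the posted schema, no DATE() wrapper is needed at all.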

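  • You are using LEFT JOIN but not checking for NULL. Perhaps a plain JOIN is more appropriate?
  • I don't think COALESCE is needed in this case, since COUNT(*) never returns NULL: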
COALESCE(COUNT(*), 0) as c

  • NOT IN ( SELECT ... ) is usually inefficient. Prefer one of the two forms below:
NOT EXISTS ( SELECT 1 ... )


LEFT JOIN ... WHERE ... IS NULL
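
To make the two patterns concrete, here is a sketch written against the `detail` table from the question. It counts forms in one facility that have no row with a status other than 1 up to the report date. The variables `@report_date` and `@facility` and their values are assumptions for illustration. The two queries should return the same count because `UniqueFormId` is declared NOT NULL.

```sql
SET @report_date = '2024-01-05';  -- example value
SET @facility    = 'Unassigned';  -- hypothetical facility name

-- Form 1: NOT EXISTS with a correlated subquery.
SELECT COUNT(DISTINCT d.UniqueFormId) AS form_count
FROM detail AS d
WHERE d.ProductionFacility = @facility
  AND d.StatusDate <= @report_date
  AND NOT EXISTS (
        SELECT 1
        FROM detail AS exd
        WHERE exd.UniqueFormId = d.UniqueFormId
          AND exd.StatusDate <= @report_date
          AND exd.ProductionStatusNo <> 1
      );

-- Form 2: LEFT JOIN ... WHERE ... IS NULL (anti-join).
-- The filters on exd stay in the ON clause so that unmatched
-- rows of d are kept and then selected by the IS NULL test.
SELECT COUNT(DISTINCT d.UniqueFormId) AS form_count
FROM detail AS d
LEFT JOIN detail AS exd
       ON exd.UniqueFormId = d.UniqueFormId
      AND exd.StatusDate <= @report_date
      AND exd.ProductionStatusNo <> 1
WHERE d.ProductionFacility = @facility
  AND d.StatusDate <= @report_date
  AND exd.UniqueFormId IS NULL;
```

Both forms let the lookup on `exd` be driven by `UniqueFormId`, instead of first materializing a DISTINCT list of every excluded form as the derived table in the original procedure does.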

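  • And, good grief: store datetime values in DATETIME or TIMESTAMP, not VARCHAR.
  • In some cases, temporary tables can be avoided by using a CTE (available in MySQL 8.0 or MariaDB 10.2).
  • These indexes *may* help performance: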
exd:  INDEX(StatusDate, ProductionStatusNo, UniqueFormId)
d:    INDEX(StatusDate, UniqueFormId, ProductionStatusNo)
d:    INDEX(ProductionFacility, StatusDate, UniqueFormId)


And DROP any index that matches the left part of one of these, such as KEY idx_detail_StatusDate (StatusDate).
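
A possible way to apply these suggestions to the posted `detail` table is sketched below. The index names are made up. The prefix length of 100 on `ProductionFacility` is an assumption: as far as I can tell, two full VARCHAR(450) utf8mb4 columns in one index would exceed InnoDB's 3072-byte key limit, so some prefix (or a shorter column definition) is needed there.

```sql
ALTER TABLE detail
  -- For the exd side: filter by date and status, then read the form id.
  ADD INDEX idx_detail_date_status_form (StatusDate, ProductionStatusNo, UniqueFormId),
  -- For the d side: filter by date, then join on the form id.
  ADD INDEX idx_detail_date_form_status (StatusDate, UniqueFormId, ProductionStatusNo),
  -- For the facility filter; the prefix length (100) is an assumed value
  -- chosen to keep the total key length under the InnoDB limit.
  ADD INDEX idx_detail_facility_date_form (ProductionFacility(100), StatusDate, UniqueFormId),
  -- Now redundant: StatusDate is the leftmost column of two indexes above.
  DROP INDEX idx_detail_StatusDate;

-- Check which index the optimizer actually picks for a typical lookup.
EXPLAIN
SELECT COUNT(DISTINCT d.UniqueFormId)
FROM detail AS d
WHERE d.StatusDate = '2024-01-05';
```

On a table of this size the ALTER can take a while, so it is worth trying on a copy first and keeping only the indexes that EXPLAIN shows are actually used.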
