SQL Server Using GROUP BY to consolidate records from a query

2o7dmzc5  于 2024-01-05  发布在  其他
关注(0)|答案(1)|浏览(190)

The 'data' for this follows -- there may be more or less rows of the source-table:
| UID_MOTRIP | UID_PERSON | UID_INSPECTION |
| ------------ | ------------ | ------------ |
| 413 | 14 | 77 |
| 411 | 17 | 0 |
| 407 | 16 | 74 |

I need to have a result that evaluates from the above table TO what shows in the following table. The 'UID_INSPECTION' value of ZERO (for UID_MOTRIP=411) indicates that the inspection WAS NOT performed. The 'InspectionCount' is where the 'UID_INSPECTION' value is > 0, meaning the inspection WAS PERFORMED. The 'UID_MOTRIP' represents the 'maximum' UID_MOTRIP value. This 'result' shall always be 1-row.

UID_MOTRIPInspectionCountInspectionMissingCount
41321
sg3maiej

sg3maiej1#

This is gonna be three separate SELECT queries nested inside the main statement. The trick is using conditional aggregation inside two of the queries:

SELECT
     (SELECT MAX(UID_MOTRIP) FROM [table]) UID_MOTRIP
   , (SELECT COUNT(CASE WHEN UID_INSPECTION > 0 THEN 1 END) FROM [table]) InspectionCount
   , (SELECT COUNT(CASE WHEN UID_INSPECTION = 0 THEN 1 END) FROM [table]) InspectionMissingCount

The problem is this really is three separate passes through the data. We can do better.

If there's a column in the table that always has the same value, you can GROUP BY that column and then do normal aggregation. This seems like an odd thing to happen, but it's actually common in vendor products with lots of configuration, where the vendor doesn't know in advance which columns you might need and the ones you don't use are left at some default value.

SELECT
     MAX(UID_MOTRIP) UID_MOTRIP
   , COUNT(CASE WHEN UID_INSPECTION > 0 THEN 1 END) InspectionCount
   , COUNT(CASE WHEN UID_INSPECTION = 0 THEN 1 END) InspectionMissingCount
FROM [table]
GROUP BY [common_column]

In this way, the database will do all the aggregation as it moves through the data in a single pass.

If you don't have such a column, you can project one:

SELECT
     MAX(UID_MOTRIP) UID_MOTRIP
   , COUNT(CASE WHEN UID_INSPECTION > 0 THEN 1 END) InspectionCount
   , COUNT(CASE WHEN UID_INSPECTION = 0 THEN 1 END) InspectionMissingCount
FROM (
   SELECT 0 as GroupColumn, UID_MOTRIP, UID_INSPECTION
   FROM [table]
) t
GROUP BY GroupColumn

This is slightly less-efficient because of the projection, but still likely to be an improvement over the basic option.

But if the performance matters here, I'd still TEST this vs the original, because the query optimization engine can sometimes surprise you.

相关问题