Is there a better SQL Server query than this

cbjzeqam  于 2023-04-28  发布在  SQL Server
关注(0)|答案(2)|浏览(170)

I'm new to SQL and i was trying to make this query better since it's very slow.

I have a SQL Server table called Measurement2 with columns testPointKey(foreign key for parent table of type int auto-incrementing), dataType (dataType is of type TEXT), and dataValue (of type float). I'm trying to select data from this table based on multiple dataType values, and I'd like to have the results in a single row with each dataType value as a separate column. Here is create for the tables(note not the full schema):

CREATE TABLE TestPoint2 (
    recordKey           INTEGER     NOT NULL PRIMARY KEY,
    testSessionKey      INTEGER     NOT NULL REFERENCES TestSession2(recordKey) ON DELETE CASCADE,
    voltage             FLOAT,
    amps                FLOAT,
    phase               FLOAT,
    seconds             INTEGER,
    iterations          INTEGER,
);

CREATE TABLE Measurement2 (
    id                  INTEGER     NOT NULL PRIMARY KEY,
    testPointKey        INTEGER     NOT NULL REFERENCES TestPoint2(recordKey) ON DELETE CASCADE,
    mIndex              INTEGER,
    dataType            TEXT,
    dataValue           FLOAT,
);

Here's an example of the desired output:
| hp | tempHp | keithley | tempKeithley | va | vc | vavc | ia | ic | phaseVc | phaseIa | phaseIc | tempVaa | tempVb | tempVc | tempIa | tempIb | tempIc | tempCta | tempCtb | tempCtc | tempSom |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 120 | 30 | 10 | 30 | 50 | 40 | 90 | 20 | 30 | 60 | 20 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 | 110 |

I've tried using multiple instances of Measurement2 and joining them on testPointKey, like this:

This returns the correct output but it's very slow, and it would get worse the more different data types i have.

SELECT t1.dataValue AS hp, t2.dataValue AS tempHp, t3.dataValue AS keithley, t4.dataValue AS tempKeithley, t5.dataValue AS va, t6.dataValue AS vc, t7.dataValue AS vavc, t8.dataValue AS ia, t9.dataValue AS ic, t10.dataValue AS phaseVc, t11.dataValue AS phaseIa, t12.dataValue AS phaseIc, t13.dataValue AS tempVaa, t14.dataValue AS tempVb, t15.dataValue AS tempVc, t16.dataValue AS tempIa, t17.dataValue AS tempIb, t18.dataValue AS tempIc, t19.dataValue AS tempCta, t20.dataValue AS tempCtb, t21.dataValue AS tempCtc, t22.dataValue AS tempSom
FROM Measurement2 t1 
INNER JOIN Measurement2 t2 ON t1.testPointKey = t2.testPointKey 
INNER JOIN Measurement2 t3 ON t1.testPointKey = t3.testPointKey 
INNER JOIN Measurement2 t4 ON t1.testPointKey = t4.testPointKey 
INNER JOIN Measurement2 t5 ON t1.testPointKey = t5.testPointKey 
INNER JOIN Measurement2 t6 ON t1.testPointKey = t6.testPointKey 
INNER JOIN Measurement2 t7 ON t1.testPointKey = t7.testPointKey 
INNER JOIN Measurement2 t8 ON t1.testPointKey = t8.testPointKey 
INNER JOIN Measurement2 t9 ON t1.testPointKey = t9.testPointKey 
INNER JOIN Measurement2 t10 ON t1.testPointKey = t10.testPointKey 
INNER JOIN Measurement2 t11 ON t1.testPointKey = t11.testPointKey 
INNER JOIN Measurement2 t12 ON t1.testPointKey = t12.testPointKey 
INNER JOIN Measurement2 t13 ON t1.testPointKey = t13.testPointKey 
INNER JOIN Measurement2 t14 ON t1.testPointKey = t14.testPointKey 
INNER JOIN Measurement2 t15 ON t1.testPointKey = t15.testPointKey 
INNER JOIN Measurement2 t16 ON t1.testPointKey = t16.testPointKey 
INNER JOIN Measurement2 t17 ON t1.testPointKey = t17.testPointKey 
INNER JOIN Measurement2 t18 ON t1.testPointKey = t18.testPointKey 
INNER JOIN Measurement2 t19 ON t1.testPointKey = t19.testPointKey 
INNER JOIN Measurement2 t20 ON t1.testPointKey = t20.testPointKey 
INNER JOIN Measurement2 t21 ON t1.testPointKey = t21.testPointKey 
INNER JOIN Measurement2 t22 ON t1.testPointKey = t22.testPointKey 
INNER JOIN TestPoint2 tp1 ON tp1.recordKey = t22.testPointKey
WHERE CAST(t1.dataType as VARCHAR) = 'hp' 
  AND CAST(t2.dataType as VARCHAR) = 'tempHp'
  AND CAST(t3.dataType as VARCHAR) = 'keithley' 
  AND CAST(t4.dataType as VARCHAR) = 'tempKeithley' 
  AND CAST(t5.dataType as VARCHAR) = 'va' 
  AND CAST(t6.dataType as VARCHAR) = 'vc' 
  AND CAST(t7.dataType as VARCHAR) = 'vavc' 
  AND CAST(t8.dataType as VARCHAR) = 'ia' 
  AND CAST(t9.dataType as VARCHAR) = 'ic'
  AND CAST(t10.dataType as VARCHAR) = 'phaseVc'
  AND CAST(t11.dataType as VARCHAR) = 'phaseIa'
  AND CAST(t12.dataType as VARCHAR) = 'phaseIc'
  AND CAST(t13.dataType as VARCHAR) = 'tempVaa'
  AND CAST(t14.dataType as VARCHAR) = 'tempVb'
  AND CAST(t15.dataType as VARCHAR) = 'tempVc'
  AND CAST(t16.dataType as VARCHAR) = 'tempIa'
  AND CAST(t17.dataType as VARCHAR) = 'tempIb'
  AND CAST(t18.dataType as VARCHAR) = 'tempIc'
  AND CAST(t19.dataType as VARCHAR) = 'tempCta'
  AND CAST(t20.dataType as VARCHAR) = 'tempCtb'
  AND CAST(t21.dataType as VARCHAR) = 'tempCtc'
  AND CAST(t22.dataType as VARCHAR) = 'tempSom';
ubbxdtey

ubbxdtey1#

One thing I would do is move the conditional expressions from the WHERE clause to each live with their appropriate JOIN (and use better alias names). It won't do much if anything for performance, but it will make the query easier to read and maintain over time. Here's a shortened example:

SELECT m.dataValue AS hp, tHP.dataValue AS tempHp, keithly.dataValue AS keithley
FROM Measurement2 m
LEFT JOIN Measurement2 tHP ON tHP.testPointKey = m.testPointKey
    AND CAST(tHP.dataType as VARCHAR) = 'tempHp'
LEFT JOIN Measurement2 keithly ON keithley.testPointKey = m.testPointKey
    AND CAST(keithly.dataType as VARCHAR) = 'keithley'
WHERE CAST(m.dataType as VARCHAR) = 'hp'

Note the use of LEFT JOIN instead of INNER , as the latter would omit the entire row (all values) if a single item were accidentally missed. This only works when we also move the conditional expressions to live the JOIN.

Separate from that, I agree with the comments about the data type of, well, dataType . The need to use a CAST() here is profoundly awful for performance, as it not only needs to run this same cast repeatedly for all values in the table, but it also breaks any possibility of using an index on the column.

The other thing you can do is use a GROUP BY expression to roll these rows up per testPointKey . Then put a CASE expression in an aggregate function like SUM() for each desired item. This is called conditional aggregation. Here's a shortened example:

SELECT 
    SUM(CASE WHEN CAST(m.dataType as VARCHAR) = 'hp' THEN m.dataValue END) As Hp
   ,SUM(CASE WHEN CAST(m.dataType as VARCHAR) = 'tempHp' THEN m.dataValue END) As tempHp
   ,SUM(CASE WHEN CAST(m.dataType as VARCHAR) = 'keithly' THEN m.dataValue END) As keithly
   -- ...
FROM Measurement2 m
GROUP BY m.testPointKey

Again, fixingdataType to avoid the cast (because what you have now really is broken) would make a profound improvement in the performance.

Finally, this resembles what is known as an Entity-Attribute-Value (EAV) schema pattern. This pattern has its place, especially in the area of plugins and extensibility, but overall it's considered an anti-pattern: that is, something to avoid until you absolutely need it. You will do MUCH better if you can design the table so it more resembles the desired output in the first place.

xmd2e60i

xmd2e60i2#

Due to the helpful comments/answers I found a good alternative in a from of conditional aggregation. With some minor tweaks made (and some more needed such as changing the TEXT to VARCHAR to avoid casting), here is updated code:

WITH
    TestPoints AS
    (
        SELECT * FROM MyTestPoints WHERE sessionKey = '9' --custom number
    ),

    UnfilteredMeasurements AS
    (
        SELECT * FROM MyMeasurements WHERE pointKey IN (SELECT recordKey FROM TestPoints)
    ),

    FilteredMeasurements AS
    (
        SELECT
            pointKey,
            MAX(index) as index,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'hp'              THEN myValue ELSE NULL END) AS hp,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempHp'          THEN myValue ELSE NULL END) AS tempHp,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'va'              THEN myValue ELSE NULL END) AS va,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'vc'              THEN myValue ELSE NULL END) AS vc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'vavc'            THEN myValue ELSE NULL END) AS vavc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'ia'              THEN myValue ELSE NULL END) AS ia,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'ic'              THEN myValue ELSE NULL END) AS ic,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'phaseVc'         THEN myValue ELSE NULL END) AS phaseVc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'phaseIa'         THEN myValue ELSE NULL END) AS phaseIa,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'phaseIc'         THEN myValue ELSE NULL END) AS phaseIc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempVa'          THEN myValue ELSE NULL END) AS tempVa,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempVb'          THEN myValue ELSE NULL END) AS tempVb,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempVc'          THEN myValue ELSE NULL END) AS tempVc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempIa'          THEN myValue ELSE NULL END) AS tempIa,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempIb'          THEN myValue ELSE NULL END) AS tempIb,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempIc'          THEN myValue ELSE NULL END) AS tempIc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempCta'         THEN myValue ELSE NULL END) AS tempCta,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempCtb'         THEN myValue ELSE NULL END) AS tempCtb,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempCtc'         THEN myValue ELSE NULL END) AS tempCtc,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'tempSom'         THEN myValue ELSE NULL END) AS tempSom,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'myDate'          THEN myValue ELSE NULL END) AS myDate,
            MAX(CASE WHEN CAST(myType AS VARCHAR) = 'myTime'          THEN myValue ELSE NULL END) AS myTime
        FROM UnfilteredMeasurements
        GROUP BY pointKey
    ),

相关问题