MySQL CAST() causes significant performance hit

qoefvg9y  于 2022-12-22  发布在  Mysql
关注(0)|答案(1)|浏览(112)

So I ran the following in the MySQL console as a control test to see what was holding back the speed of my query.

SELECT bbva_deductions.ded_code, SUBSTRING_INDEX(bbva_deductions.employee_id, '-' , -1) AS tt_emplid, 
                bbva_job.paygroup, bbva_job.file_nbr, bbva_deductions.ded_amount 
                FROM bbva_deductions 
                LEFT JOIN bbva_job 
                ON CAST(SUBSTRING_INDEX(bbva_deductions.employee_id, '-' , -1) AS UNSIGNED) = bbva_job.emplid LIMIT 500

It took consistently around 4 seconds to run. (seems very high for only 500 rows). Simply removing the CAST within the JOIN decreased that to just 0.01 seconds.
In this context, why on earth is CAST so slow?
Here is the output of an EXPLAIN for this query:

And the same for the query without a CAST :

EXPLAIN EXTENDED :

ftf50wuq

ftf50wuq1#

As documented under How MySQL Uses Indexes :
MySQL uses indexes for these operations:

[ ***deletia*** ]
  • To retrieve rows from other tables when performing joins. MySQL can use indexes on columns more efficiently if they are declared as the same type and size. In this context, VARCHAR and CHAR are considered the same if they are declared as the same size. For example, VARCHAR(10) and CHAR(10) are the same size, but VARCHAR(10) and CHAR(15) are not.

Comparison of dissimilar columns may prevent use of indexes if values cannot be compared directly without conversion. Suppose that a numeric column is compared to a string column. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as '1' , ' 1' , '00001' , or '01.e1' . This rules out use of any indexes for the string column.
In your case, you are attempting to join on a comparison between a substring (of a string column in one table) and a string column in another table. An index can be used for this operation, however the comparison is performed lexicographically (i.e. treating the operands as strings, even if they represent numbers).
By explicitly casting one side to an integer, the comparison is performed numerically (as desired) - but this requires MySQL to implicitly convert the type of the string column and therefore it is unable to use that column's index.
You have hit this road bump because your schema is poorly designed. You should strive to ensure that all columns:

  1. are encoded using the data types that are most relevant to their content; and
  2. contain only a single piece of information — see Is storing a delimited list in a database column really that bad?
    At the very least, your bbva_job.emplid should be an integer; and your bbva_deductions.employee_id should be split so that its parts are stored in separate (appropriately-typed) columns. With appropriate indexes, your query will then be considerably more performant.

相关问题