Friday, October 27, 2006

IN Condition Tricks

Not your type, huh? Change for the better.
I needed to pull some data from our contact database (ichaos) based on ZIP code. The list of codes to use was fairly lengthy and I was concerned about the length of a possible WHERE clause since the data is stored as text to accommodate both the 5-digit and 9-digit US formats (not to mention postal codes from other countries).

Even though the column is stored as text I decided to use the IN() condition to provide the condition. This worked much better than I expected. Not only were the specific matches returned (we used the 5-digit variation in the conditional), but MySQL also returned any variations on the provided condition.

The following SQL statement:
SELECT … WHERE post_code IN (12345);
returns not only the records matching 12345 but also any that match on the first five characters (such as 12345-6789).

The reason this search appears to work appears to do with type conversion (see Type Conversion in Expression Evaluation). Since we are attempting to compare a string to a number the two arguments are converted to floating point numbers by default. To have better control over the comparison we could use the LEFT() function to truncate the zip code to 5 digits, then the CAST() function to control the conversion of the column used in the statement.


Your custom order
While performing my search I noticed another useful tip for the IN() condition in the comments of the SELECT syntax documentation page. If you use an IN() condition in the ORDER BY clause you can specify a custom ordering scheme for a column. So the following statement:
SELECT … ORDER BY %column_name% IN (%value1%, %value2%), %column_name%;
will order the column first by the values specified in the IN() condition followed by the default ordering scheme for the type of data stored in that column.


Update 2006/12/05:
After some further testing I've found that the custom ordering trick doesn't work as I thought. What it does is group the records matching the condition and sort those as a group. It does not sort the values according to the order specified in the parenthesis. If you want specific control of the ordering you'll have to do a separate IN() condition for each value, thus grouping all the records with that value.

What this means is that my previous example will place all records with %value1% and %value2% at the top, but the sorting of those values will be whatever the default is (the order entered in the database unless other sorting options have been specified).

To get the results I had been seeking you have to use something more like the following:
SELECT … ORDER BY %column_name% IN (%value1%), %column_name% IN (%value2%), %column_name%;