Query tuning: The IN clause

Update (2019/02/08): Originally the index definition on this post was on Users(DisplayName) INCLUDE Age. This was preventing an Index Seek because I wanted to show an Index Scan. I didn’t mention it in the original, but it’s clear to me that I made a mistake. Thank you to Bryan and John in the comments for pointing this out. Given that information I have re-written this post.

This post will cover the IN clause and another way to rewrite the same logic. I don’t intend to say that IN is better or worse than other T-SQL, I just want to showcase what it does.

We’ll jump right into the demos. I’m using StackOverflow2010 for this, and I’m going to create an index to be used by this query.

CREATE NONCLUSTERED INDEX ix_Age_DisplayName on Users (Age)
 INCLUDE (DisplayName)
SELECT DisplayName
FROM Users as u
WHERE Age IN (96,97,98,99,100)

Okay, cool. Let’s take a look at the Index Seek.

I want to focus on the bottom part. SQL Server decided to split our query into a series of Seek Keys.

This has interesting implications for how our query was optimized. Let’s take a look at another way of writing the same query.

SELECT DisplayName
FROM Users as u
JOIN (SELECT 96 as Age UNION ALL
SELECT 97 as Age  UNION ALL
SELECT 98 as Age  UNION ALL
SELECT 99 as Age  UNION ALL
SELECT 100 as Age ) as A1 on A1.Age = u.Age

This query will get the same result set. This time, let’s run both queries in the same batch. Ignore the Query Cost, just look at the different execution plans.

This is what query tuning is all about. Rewriting a query in two different ways to see if the optimizer will pick a different execution plan.

Comparing the IN clause vs the UNION ALL

There’s two significant differences in these plans. Let’s take a look at the execution plans, using the execution plan comparison feature in Management Studio.

Click on the whole comparison if you’re curious. I’ll zoom in on the differences below. The first plan using the IN clause is on the left. The second plan using the UNION ALL is on the right.

One thing to note is the Optimization level is different. The first query using the IN statement only received trivial optimization while the UNION query received full.

Taking a look at how the predicates were applied

I showed how the Seek Keys were applied above, but here’s the predicates from the Index Seek when using the UNION ALL approach:

And of course, the number of rows read:

Moral of the post

The point of this post is that query tuning can be as simple as re-writing the IN clause. I don’t want to say that either way is better, since it depends on your environment.

Don’t rewrite code unless there’s already a performance issue, and if you do, test it thoroughly.


3 thoughts on “Query tuning: The IN clause”

  1. Hi, Arthur. I’m curious what the performance characteristics for the 2 queries are if you have a more optimal nonclustered index on Users(Age) INCLUDE (DisplayName).

    1. Hey Bryan! I went back with this feedback and re-wrote the entire post. Thank you for the thoughts on this.

      The original Index Scan showed a memory grant which was nice, but overall my index was really sub-optimal.

Leave a Reply

Your email address will not be published. Required fields are marked *

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close