Feb 16, 2010

The Other side of Filtered Index

 
I see many like the idea of filtered index. This cooooooooooool new feature allows an index to be created on a subset of the data. I have seen many presentations praising this feature.

Most of you who attended the session on “The other side of indexes”  at the last Tech Ed would have enjoyed the session well. Pinal Dave was giving a very good explanation on when to use indexes and when not to use them. He mentioned on Filtered index and explained how it will benefit certain type of queries.


If you have a large table (by rows) your index will be larger.  It adds complexity over the maintenance plans you need to build.  Additionally, statistics around larger indexes can get skewed easily. This may lead to sub-optimal execution plans chosen by SQL Server Query engine.

On of the key features on filtered index is the WHERE clause. The condition may not reference the key columns of the index For example, CREATE INDEX IndexName ON TableName (Column1) WHERE Column2 >100 is a valid statement. This allows a high level of logical partitioning of data. You can create an index for column1 where Column2 is mentioned in the definition but not in the storage.

I am inspired by the feature. Wow filtered index can make wonders.  It will ease some pain points from DBAs. Suddenly a question came into my mind:  Can I create multiple filtered indexes on the entire table by
"Partitioning" a key column? For example IF my key column is of datetime data type, can I create 12 filtered indexes one for January and other for February and so on up to December?

SQL Server allowed such indexes (unless you exceed the limit on the number of indexes you can create).  Do they improve performance?
I did some research on that point and hence the post.  Please note that my research is still going on and the final results may differ from this. (Interim reports are famous in political circles.  Why shouldn't we use it?)
 
  • Make sure you have smaller range for the filtered index.  When the filtered index covers a larger percentage of the data, it performs relatively bad. 

  • Filtered indexes are good only when you have queries falling within that range. If you have two filtered indexes on a column and if your query needs data from both,  most probably SQL Server will ignore both unless you have written the query combining two statements.

  • Even if you combine two queries to get the benefit of multiple filtered indexes,  they may still under perform than a table which has single non clustered index.

  • Be aware of using filtered indexes against string columns. If LIKE operator is used in your query, there is a possibility that SQL Server will not use the filtered index.
 I am planning to write a detailed article (or a bigger post soon on this) when all my tests are completed.  Please stay tuned.

1 comment:

DK said...

One issue with Qry optimizer using Filtered Index is that the query needs to explicitly reference the filtered index value. For eg:

CREATE INDEX IndexName ON TableName (Column1) WHERE Column2 >100

foll. query will most probably not be able to use the index:
declare @filter int = 150
select *
from TableName
where Column2 = @filter

but the foll. query will be able to use it:
select *
from TableName
where Column2 = 150

Post a Comment