SQL 2008+

SQL Server Filtered Indexes

by Richard Carr, published at http://www.blackwasp.co.uk/SQLFilteredIndexes.aspx

SQL Server allows data in selected columns from entire tables to be indexed, in order to improve the performance of some queries. SQL Server 2008 introduces the concept of filtered indexes. These index a limited set of the rows in a table.

Filtered Indexes

I described the basic use of standard indexes in the SQL Server Programming Fundamentals Tutorial. Simply put, an index holds the data from some columns in a table. The index is organised so that this data can be found more quickly than by simply scanning the entire table for a particular value. This means that queries that can use an index's contents are generally faster to execute than those than perform table scans. In a way, SQL Server indexes are analogous to indexes in a book. It's much quicker to find the topic you wish to read by looking it up using an index, rather than flicking through all of the book's pages. As with the book, there is an overhead in building an index, meaning that inserts and updates may be less performant than if the index was not present.

Before the release of SQL Server 2008, a table index would always include data from every row in the underlying table. If most queries only related to a specific subset of the information, the extra data in the index might never be used. In this case these unused index nodes would take up space unnecessarily. They would also slow down inserts and updates for rows that would not benefit from the index.

To address this, SQL Server 2008 introduced the concept of filtered indexes. A filtered index is constructed in the same manner as any other table index. However, the index includes a WHERE clause that determines which rows are included in the index and which are omitted. You can add a filtered index to a table for a limited subset of the rows, minimising the storage and update performance implications whilst still getting improvements in query speed. In fact, queries may perform better than with a full index, due to the smaller index size.

Another interesting use of filtered indexes is the ability to create unique filtered indexes. These allow you to specify that the data in a particular column, or combination of columns, must be unique within the items covered by the index only. For example, you might create a unique index for all non-NULL items in a column. In this scenario, all of the rows with values must be unique but you can insert as many NULLs as desired.

Depending upon the situation, you may wish to create multiple filtered indexes for a single table, each holding a different subset of the rows. You can have the indexes overlapping if this is useful. However, you cannot apply a filtered index to a view.

Creating a Filtered Index

To demonstrate the creation of a filtered index we need a sample table and some data. To create the table, run the following script against a new database. This creates a simple table for storing employee data. The table includes a self-referencing foreign key that allows you to find an employee's manager. The ManagerID column will hold an EmployeeID reference for subordinates and NULL for the highest level employees.

CREATE TABLE Employees
(
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(50),
    ManagerID INT NULL
)

ALTER TABLE Employees
ADD CONSTRAINT FK_Employee_Manager
FOREIGN KEY (ManagerID)
REFERENCES Employees (EmployeeID)

Let's add a few rows of sample data:

INSERT INTO Employees
    (EmployeeID, Name, ManagerID)
VALUES
    (1, 'Bob Boss', NULL),
    (2, 'Mel Manager', NULL),
    (3, 'Sam Black', 1),
    (4, 'Jim Brown', 1),
    (5, 'Tim Green', 2),
    (6, 'Sue White', 2),
    (7, 'Dan Blue', 6)

You create a filtered index using almost the same syntax as for a standard, non-clustered index. The only difference is that you add a WHERE clause that determines which rows are indexed. Any rows for which the WHERE predicate returns false are excluded from the index. In the sample case below, only top-level managers are indexed.

CREATE NONCLUSTERED INDEX IX_Managers
ON Employees (Name)
WHERE ManagerID IS NULL

This index can now be used by queries that limit the results to rows within the index. The following query is such an example. However, as the data set we have is so small, it's unlikely that SQL Server will use it in this case.

SELECT * FROM Employees WHERE ManagerID IS NULL AND Name = 'Mel Manager'

Next: Page 2

24 October 2013