Introduction To SQL Server Indexing

Mar 2, 2023

Updated on August 9, 2020.

What Are Indexes In SQL Server

Indexes in SQL Server are database objects that help improve the performance of queries by providing quick access to data in a table.

An index is essentially a data structure that maps the values in one or more columns of a table to their physical locations on disk. When a query is executed that includes the indexed columns in its search criteria, the index can be used to locate the rows that match the criteria much more quickly than if the database had to scan the entire table.

Indexes can be created on one or more columns in a table or view, and they can be either clustered or non-clustered. A clustered index determines the physical order of the data in a table, while a non-clustered index is a separate structure that points to the location of the data.

It's important to note that indexes can also have an impact on the performance of INSERT, UPDATE, and DELETE operations, as these operations may need to update the index in addition to the table data. Therefore, it's important to carefully consider the columns to include in an index and the type of index to use based on the specific needs of the database and its queries.

A SQL Server Index Consists Of The Following Components:

Index Key:

The index key is the column or columns on which the index is built. It defines the order in which entries are stored in the index and is used to locate the data.

Leaf Nodes:

The leaf nodes of the index contain the index key values and pointers to the corresponding rows in the table. In a clustered index, the leaf nodes contain the data rows themselves.

Non-Leaf Nodes:

Non-leaf nodes of the index contain pointers to other nodes in the index. These nodes are used to navigate the index to find the data.

Root Node:

The root node is the top-level node of the index. It contains pointers to the other nodes in the index and is used to start the search for data.

Pages:

The index is stored on disk in a series of 8 KB pages. Each page contains a portion of the index data and a pointer to the next page in the same index.

Index Fragmentation:

Index fragmentation occurs when the data in the index is not stored in contiguous pages, which can cause slower search times.

Fill Factor:

The fill factor is a setting that determines how full each leaf page is made when the index is created or rebuilt, and therefore how much free space is left on each page. Leaving free space reduces page splits as rows are added, which in turn affects how often the index needs to be reorganized or rebuilt to remove fragmentation.

Statistics:

Statistics are used by SQL Server's query optimizer to estimate the number of rows that will be returned by a query. They are created automatically when an index is created and can be manually updated.

Filtered Indexes:

A filtered index covers only the subset of a table's rows that meets a filter condition. Filtered indexes can be created to improve performance for specific queries.

In summary, a SQL Server index consists of several components, including the index key, leaf nodes, non-leaf nodes, root node, and pages, along with related concepts and settings such as fragmentation, fill factor, and statistics. Understanding these components can help you create and maintain indexes that provide optimal performance for your SQL Server database.
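To make fragmentation and fill factor more concrete, here is a minimal T-SQL sketch. The names dbo.mytable and idx_mytable are placeholders, and the fill factor of 80 is only an illustrative value, not a recommendation.

```sql
-- Placeholder names: dbo.mytable and idx_mytable are assumptions for illustration.
-- Report fragmentation and size for every index on the table.
SELECT i.name AS IndexName,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.mytable'), NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id = ps.index_id;

-- Rebuild one index, filling each leaf page to about 80 percent
-- so that roughly 20 percent is left free for future rows.
ALTER INDEX idx_mytable ON dbo.mytable
REBUILD WITH (FILLFACTOR = 80);
```

A lower fill factor leaves more room for new rows but makes the index larger, so the right value depends on how the table is written to.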

Why Create An Index In SQL? - Can I have Too Many Indexes?

While indexes can improve query performance by providing fast access to data, they also have some overhead associated with them. Every index added to a table requires additional disk space to store the index data, and each index must be updated whenever the underlying table data changes, which can slow down write operations.

Additionally, too many indexes can lead to increased query optimization overhead, as SQL Server must consider a larger number of indexes when generating query plans. This can result in longer query compilation times because of the extra time required to evaluate all possible index options.

Therefore, it's important to carefully consider the columns to include in an index and the type of index to use, based on the specific needs of the database and its queries. In general, it's recommended to keep the number of indexes per table to a reasonable number and to regularly monitor and optimize index usage to ensure optimal database performance.
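One possible way to spot indexes that cost more than they return is to look for non-clustered indexes that are being updated but never read. The query below is a sketch of that idea; the column aliases are my own, and the counters it reads are reset whenever SQL Server restarts, so the result should be judged over a representative period.

```sql
-- Non-clustered indexes in the current database with writes but no reads.
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name AS IndexName,
       ISNULL(s.user_updates, 0) AS WriteCount,
       ISNULL(s.user_seeks + s.user_scans + s.user_lookups, 0) AS ReadCount
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
  ON s.object_id = i.object_id
 AND s.index_id = i.index_id
 AND s.database_id = DB_ID()          -- the DMV is server-wide
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type_desc = 'NONCLUSTERED'
  AND i.is_primary_key = 0            -- keep indexes that enforce constraints
  AND i.is_unique_constraint = 0
  AND ISNULL(s.user_seeks + s.user_scans + s.user_lookups, 0) = 0
ORDER BY WriteCount DESC;
```

An index that appears here is only a candidate for review; it may still support a rarely run but important query.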

Differences Between Versions Of SQL Server

There are several differences between versions of SQL Server regarding indexes. Here are some of the key differences:

Index types:

New index types have been introduced in different versions of SQL Server. For example, SQL Server 2012 introduced nonclustered columnstore indexes, which are designed for data warehousing workloads, while SQL Server 2014 added updatable clustered columnstore indexes and the hash and nonclustered indexes used by memory-optimized tables.

Query optimization:

SQL Server has evolved over the years to improve query optimization, which affects how indexes are used. For example, SQL Server 2016 introduced the query store feature, which allows you to troubleshoot query performance problems and monitor query plan changes over time.

Index creation and maintenance:

Different versions of SQL Server have introduced improvements in index creation and maintenance, such as online index rebuilds and defragmentation. For example, SQL Server 2017 introduced resumable online index rebuilds, which allow a rebuild on a very large table to be paused and resumed with minimal downtime.
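One concrete form of this in SQL Server 2017 and later is the resumable online rebuild. The sketch below assumes placeholder names (dbo.mytable, idx_mytable), an edition that supports online index operations, and an arbitrary 60-minute limit.

```sql
-- Placeholder names; requires SQL Server 2017 or later.
-- Start an online rebuild that can be paused and resumed.
ALTER INDEX idx_mytable ON dbo.mytable
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);

-- From another session: pause the operation, then resume it later.
ALTER INDEX idx_mytable ON dbo.mytable PAUSE;
ALTER INDEX idx_mytable ON dbo.mytable RESUME;

-- Check the progress of any resumable operations in the current database.
SELECT name, state_desc, percent_complete
FROM sys.index_resumable_operations;
```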

Performance improvements:

As SQL Server has evolved, it has introduced many performance improvements related to indexing. For example, SQL Server 2019 added the OPTIMIZE_FOR_SEQUENTIAL_KEY index option, which can improve insert throughput on large tables whose indexes have ever-increasing keys.

JSON indexing:

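Starting with SQL Server 2016, JSON data can be stored in NVARCHAR columns and queried with built-in functions such as JSON_VALUE. SQL Server does not index JSON text automatically; instead, a value extracted with JSON_VALUE can be exposed as a computed column, and a regular index can be created on that column. This allows for efficient querying of JSON data using indexes.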
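A common way to index a JSON property is through a computed column. The following is a minimal sketch under assumed names: a table dbo.Orders with an OrderId column and an NVARCHAR(MAX) column called Details that holds JSON text with a customer.name property. None of these names come from a real schema.

```sql
-- Hypothetical table: dbo.Orders (OrderId INT, Details NVARCHAR(MAX) holding JSON).
-- Expose one JSON property as a computed column of bounded length.
ALTER TABLE dbo.Orders
ADD CustomerName AS CAST(JSON_VALUE(Details, '$.customer.name') AS NVARCHAR(100));

-- Index the computed column like any other column.
CREATE NONCLUSTERED INDEX idx_Orders_CustomerName
ON dbo.Orders (CustomerName);

-- Queries that filter on the computed column can now use the index.
SELECT OrderId, Details
FROM dbo.Orders
WHERE CustomerName = N'Contoso';
```

The CAST keeps the index key short; without it, JSON_VALUE returns NVARCHAR(4000), which is wider than an index key should be.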

Overall, the evolution of SQL Server has led to many improvements in indexing functionality, which have resulted in improved query performance and scalability.

Types of Indexes In SQL Server

There are several types of indexes in SQL Server, each with its own strengths and weaknesses. Here are some of the most commonly used index types:

Clustered Index:

A clustered index determines the physical order of the data in a table. A table can have only one clustered index, and it's typically created on the primary key column of the table. When a query is executed that includes the clustered index key columns in its search criteria, the clustered index can be used to quickly locate the matching rows.

In SQL Server, a clustered index determines the physical order of the data in a table. By default, it is created on the primary key of a table, but it can also be created on other columns. When a clustered index is created, SQL Server physically reorders the data in the table based on the values in the indexed column(s).

Here is an example of creating a clustered index on a single column of a table in T-SQL:

CREATE CLUSTERED INDEX idx_mytable
ON dbo.mytable (id);

In this example, a clustered index named idx_mytable is created on the id column of the mytable table. Once the index is created, SQL Server will physically reorder the data in the table based on the values in the id column.

Clustered indexes are particularly useful for tables that are frequently queried using range searches, such as BETWEEN or >, because they can quickly locate the desired rows based on the physical order of the data. However, because rows must be kept in key order, inserting new rows or updating key values in a table with a clustered index can be slower than in a table without one, particularly when a row has to go into a page that is already full. Additionally, each table can have only one clustered index.
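The default behavior of the primary key can be seen with a small sketch. The table dbo.demo_orders and its columns are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE dbo.demo_orders
(
    id         INT IDENTITY(1, 1) NOT NULL,
    order_date DATE NOT NULL,
    amount     DECIMAL(10, 2) NOT NULL,
    CONSTRAINT pk_demo_orders PRIMARY KEY (id)  -- becomes the clustered index by default
);

-- Confirm which index on the table is clustered.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.demo_orders');

-- A range search on the clustering key reads one contiguous, ordered slice of rows.
SELECT id, order_date, amount
FROM dbo.demo_orders
WHERE id BETWEEN 100 AND 200;
```

To keep the primary key but cluster on a different column, the constraint can be declared as PRIMARY KEY NONCLUSTERED and a separate CREATE CLUSTERED INDEX statement used.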

Non-Clustered Index:

A non-clustered index is a separate data structure that points to the location of the data. A table can have multiple non-clustered indexes, and they are typically created on columns that are frequently used in search criteria. When a query is executed that includes the non-clustered index columns in its search criteria, the non-clustered index can be used to quickly locate the matching rows.

In SQL Server, a non-clustered index is a separate structure from the table that provides a quick lookup of data based on the values in one or more columns. Unlike a clustered index, a non-clustered index does not determine the physical order of the data in the table. Instead, it contains a copy of the indexed columns and a pointer to the corresponding rows in the table.

Here is an example of creating a non-clustered index on a table in T-SQL:

CREATE NONCLUSTERED INDEX idx_mytable
ON dbo.mytable (col1, col2);

In this example, a non-clustered index named idx_mytable is created on the col1 and col2 columns of the mytable table. Once the index is created, SQL Server will use it to quickly locate the rows that match a query's search criteria based on the values in col1 and col2.

Non-clustered indexes are particularly useful for tables that are frequently queried using specific columns, as they can quickly locate the desired rows without having to scan the entire table. Unlike a clustered index, a table can have multiple non-clustered indexes. However, because each non-clustered index is a separate structure that must be kept in sync with the table, inserting new rows or updating existing rows in a table with one or more non-clustered indexes can be slower than in a table without them.

Unique Index - No Duplicate Values

A unique index is similar to a non-clustered index, but it requires that the indexed columns contain only unique values. Unique indexes are typically created on columns that have a high degree of uniqueness, such as primary keys or columns that have a unique constraint.

In SQL Server, a unique index enforces a constraint on one or more columns, requiring that each combination of values in the index key column(s) be unique across all rows in the table. A unique index is similar to a non-clustered index, but with the added constraint that no two rows can have the same values in the indexed column(s).

Here is an example of creating a unique index on a table in T-SQL:

CREATE UNIQUE INDEX idx_mytable
ON dbo.mytable (col1, col2);

In this example, a unique index named idx_mytable is created on the col1 and col2 columns of the mytable table. Once the index is created, SQL Server will enforce the constraint that no two rows can have the same combination of values in col1 and col2. Duplicate values in col1 alone or in col2 alone are still allowed.

Unique indexes are particularly useful for tables that have columns with a high degree of uniqueness, such as a primary key or columns with a unique constraint. They can be created using the CREATE UNIQUE INDEX statement, which is similar to the CREATE INDEX statement used to create non-clustered indexes. Additionally, like non-clustered indexes, a table can have multiple unique indexes.
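How a two-column unique index behaves can be checked with a short experiment. The table dbo.demo_unique and the index ux_demo_unique below are hypothetical names chosen for this sketch.

```sql
-- Hypothetical table and index names.
CREATE TABLE dbo.demo_unique (col1 INT NOT NULL, col2 INT NOT NULL);
CREATE UNIQUE INDEX ux_demo_unique ON dbo.demo_unique (col1, col2);

INSERT INTO dbo.demo_unique (col1, col2) VALUES (1, 1);
INSERT INTO dbo.demo_unique (col1, col2) VALUES (1, 2);  -- allowed: the pair (1, 2) is new

BEGIN TRY
    -- Rejected: the pair (1, 1) already exists in the index.
    INSERT INTO dbo.demo_unique (col1, col2) VALUES (1, 1);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

The second insert succeeds even though col1 repeats, because uniqueness is checked on the combination of key columns rather than on each column separately.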

Full-Text Index:

A full-text index is used for efficient text searches on columns that contain large amounts of text data, such as document content or comments. Full-text indexes can be used to search for words or phrases, and they support advanced search features such as stemming, thesaurus, and proximity searches.

In SQL Server, a full-text index is a type of index that is used to improve the performance of text-based searches on large amounts of text data, such as documents, articles, or other unstructured data. Unlike a regular index, which stores whole column values in sorted order, a full-text index creates a "word catalog" of the text data, allowing for more efficient searching.

Here is an example of creating a full-text index on a table in T-SQL:

CREATE FULLTEXT INDEX ON dbo.mytable
(col1, col2)
KEY INDEX pk_mytable
WITH STOPLIST = SYSTEM;

In this example, a full-text index is created on the col1 and col2 columns of the mytable table, using the primary key index pk_mytable as the key index that uniquely identifies each row for the full-text index. The STOPLIST parameter specifies which stop words (commonly used words like "the" or "and") to exclude from the word catalog; SYSTEM selects the built-in system stoplist.

Full-text indexes are particularly useful for text-based searches that involve complex queries or that require high performance on large amounts of data. They are created using the CREATE FULLTEXT INDEX statement rather than the CREATE INDEX statement used to create regular indexes. Additionally, full-text indexes can only be created on character-based or binary columns (such as char, varchar, nvarchar, text, ntext, xml, or varbinary(max)) in a table that has a unique, single-column, non-nullable index to serve as the key index, and they require additional maintenance to keep the word catalog up-to-date.
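Once a full-text index exists, it is queried with predicates such as CONTAINS and FREETEXT rather than LIKE. The sketch below reuses the dbo.mytable example; the catalog name ft_demo_catalog and the search terms are assumptions made for illustration.

```sql
-- A full-text index lives in a full-text catalog, so one must exist first.
-- ft_demo_catalog is a hypothetical name.
CREATE FULLTEXT CATALOG ft_demo_catalog AS DEFAULT;

-- Run the CREATE FULLTEXT INDEX statement shown earlier at this point,
-- then allow the index to finish populating before querying it.

-- Rows where "index" appears near "performance" in col1.
SELECT col1, col2
FROM dbo.mytable
WHERE CONTAINS(col1, N'"index" NEAR "performance"');

-- Rows in either column that match the meaning of the phrase, not just its exact wording.
SELECT col1, col2
FROM dbo.mytable
WHERE FREETEXT((col1, col2), N'rebuilding indexes');
```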

Spatial Index:

A spatial index is used for efficient searches on columns that contain spatial data, such as points, lines, or polygons. Spatial indexes are used to perform distance calculations, nearest-neighbor searches, and other advanced spatial queries.

In SQL Server, a spatial index is a type of index that is used to improve the performance of spatial data queries on large amounts of spatial data. Spatial data refers to data that has a geographic or spatial component, such as points, lines, or polygons.

Here is an example of basic syntax for creating a spatial index on a table in T-SQL:

CREATE SPATIAL INDEX idx_mytable
ON dbo.mytable (geography_column)
USING GEOGRAPHY_AUTO_GRID;

In this example, a spatial index named idx_mytable is created on the geography_column column of the mytable table. The USING clause specifies the tessellation scheme to use, with GEOGRAPHY_AUTO_GRID indicating a grid-based scheme for geography data that does not require grid densities to be specified by hand.

Spatial indexes are particularly useful for spatial data queries that involve complex spatial relationships, such as proximity searches or intersection queries. They can be created using the CREATE SPATIAL INDEX statement, which is similar to the CREATE INDEX statement used to create regular indexes. Additionally, spatial indexes can only be created on tables that contain spatial data types, such as the geography or geometry data types in SQL Server.
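The kind of query a spatial index is meant to help can be sketched as follows. The point stored in @here is an arbitrary illustrative location (latitude, longitude, SRID 4326), and the 1000-meter radius is likewise an assumption.

```sql
-- Illustrative reference point: latitude 47.6062, longitude -122.3321, SRID 4326.
DECLARE @here GEOGRAPHY = geography::Point(47.6062, -122.3321, 4326);

-- Proximity search: rows within 1000 meters of the point.
SELECT geography_column.STDistance(@here) AS DistanceInMeters
FROM dbo.mytable
WHERE geography_column.STDistance(@here) <= 1000;

-- Nearest-neighbor search: the five closest rows.
SELECT TOP (5) geography_column.STDistance(@here) AS DistanceInMeters
FROM dbo.mytable
WHERE geography_column.STDistance(@here) IS NOT NULL
ORDER BY geography_column.STDistance(@here);
```

The nearest-neighbor form follows the shape SQL Server expects for that kind of query: a TOP clause, an STDistance call in both WHERE and ORDER BY, and a NULL check on the distance.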

Filtered Index:

A filtered index is a non-clustered index that includes only a subset of the rows in a table, based on a filter condition. Filtered indexes can be used to improve query performance on frequently used subsets of data.
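A filtered index is created by adding a WHERE clause to CREATE INDEX. The sketch below assumes a hypothetical dbo.Orders table with Status and OrderDate columns, where only a small share of rows have the status 'Open'.

```sql
-- Hypothetical table and columns: dbo.Orders (Status, OrderDate).
-- Index only the rows that are still open.
CREATE NONCLUSTERED INDEX idx_Orders_Open
ON dbo.Orders (OrderDate)
WHERE Status = 'Open';

-- A query whose predicate matches the filter can use the smaller index.
SELECT OrderDate
FROM dbo.Orders
WHERE Status = 'Open'
  AND OrderDate >= '20230101';
```

Because the index holds only the open orders, it is smaller and cheaper to maintain than a full index on OrderDate.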

Overall, choosing the right index type and columns to include in an index is critical to optimizing query performance in SQL Server.

Creating Indexes - Clustered Index - SSMS

[Video: how to create clustered indexes in SQL Server 2022]

To create a clustered index in SQL Server Management Studio (SSMS), follow these steps:

Open SSMS and connect to the database where you want to create the clustered index.

In the Object Explorer pane, expand the database where you want to create the clustered index.

Expand the Tables folder and locate the table where you want to create the clustered index.

Right-click on the table and select "Design" from the context menu.

In the table designer, select the column or columns that you want to include in the clustered index. To select multiple columns, hold down the Ctrl key while clicking on the column names.

Right-click on the selected column(s) and choose "Indexes/Keys" from the context menu.

In the "Indexes/Keys" dialog box, click on the "Add" button to create a new index.

In the "New Index" dialog box, set the following options:

Index name: Enter a name for the clustered index.

Index type: Select "Clustered" from the drop-down menu.

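Index key columns: Confirm the column or columns that make up the index key. A clustered index has no included columns, because its leaf level already contains every column of the table.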

Click on the "OK" button to create the clustered index.

In the table designer, click on the "Save" button to save the changes to the table.

That's it! You have now created a clustered index on the selected column(s) in SSMS.

Creating Indexes - Nonclustered Index - SSMS

To create a non-clustered index in SQL Server Management Studio (SSMS), follow these steps:

Open SSMS and connect to the database where you want to create the non-clustered index.

In the Object Explorer pane, expand the database where you want to create the non-clustered index.

Expand the Tables folder and locate the table where you want to create the non-clustered index.

Right-click on the table and select "Design" from the context menu.

In the table designer, select the column or columns that you want to include in the non-clustered index.

To select multiple columns, hold down the Ctrl key while clicking on the column names.

Right-click on the selected column(s) and choose "Indexes/Keys" from the context menu.

In the "Indexes/Keys" dialog box, click on the "Add" button to create a new index.

In the "New Index" dialog box, set the following options:

Index name: Enter a name for the non-clustered index.

Index type: Select "Nonclustered" from the drop-down menu.

Included columns: Add any additional columns that you want to include in the index.

Click on the "OK" button to create the non-clustered index.

In the table designer, click on the "Save" button to save the changes to the table.

That's it! You have now created a non-clustered index on the selected column(s) in SSMS.

Statistics And Indexes - How Are They Related?

Statistics and indexes are related in the sense that they both help the SQL Server query optimizer to create an efficient query plan for a given query.

Indexes provide a way for SQL Server to quickly retrieve the data that matches a specific query condition. When an index is created on a column or set of columns, SQL Server stores a copy of the indexed column values in the index, sorted in a specific order. This makes it easier and faster for SQL Server to find the relevant data when a query on those columns is executed.

Statistics, on the other hand, provide information about the distribution of values in one or more columns of a table. SQL Server uses statistics to estimate the number of rows that will be returned by a query and to determine the most efficient way to retrieve the data. The query optimizer uses statistics to choose the best execution plan for a given query.

When an index is created on a column or set of columns, SQL Server automatically generates statistics for those columns. These statistics help the query optimizer to create an efficient query plan by providing information about the distribution of values in the index. SQL Server uses this information to estimate the number of rows that will be returned by a query and to choose the best execution plan.

In summary, indexes and statistics are both important tools that help SQL Server to optimize queries. Indexes provide a way to quickly retrieve the data that matches a specific query condition, while statistics provide information about the distribution of values in a table, which helps the query optimizer to estimate the number of rows that will be returned by a query and to choose the best execution plan.
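The statistics attached to an index can be inspected and refreshed directly. This sketch reuses the placeholder names dbo.mytable and idx_mytable; whether a full scan is appropriate depends on the size of the table.

```sql
-- List the statistics on the table, with their age and the number of changes since the last update.
SELECT s.name AS StatsName,
       s.auto_created,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.mytable');

-- Show the histogram that the optimizer uses to estimate row counts for the index.
DBCC SHOW_STATISTICS ('dbo.mytable', idx_mytable) WITH HISTOGRAM;

-- Manually refresh the statistics for that index by reading every row.
UPDATE STATISTICS dbo.mytable idx_mytable WITH FULLSCAN;
```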

Here are some general guidelines for creating an indexing strategy:

Identify the queries that are most important to your application:

You should focus on optimizing the queries that are most frequently used and/or the queries that are most resource-intensive. This will help you to prioritize which tables and columns to index.

Identify the columns that are frequently used in WHERE, JOIN, and ORDER BY clauses:

These columns are good candidates for indexing because they are used to filter, join, or sort the data.

Choose the right type of index:

SQL Server supports several types of indexes, including clustered indexes, nonclustered indexes, unique indexes, and filtered indexes. Choose the type of index that best fits your query patterns.

Avoid over-indexing:

Too many indexes can slow down data modifications and increase your storage space requirements. You should only create indexes that are necessary to support your query patterns.

Avoid under-indexing:

Not having enough indexes can slow down query performance. You should create indexes that are necessary to support your query patterns.

Consider using covering indexes:

A covering index includes all of the columns that are needed to satisfy a query. This can reduce the number of disk I/O operations required to execute the query, because SQL Server does not need to look up the remaining columns in the table.

Regularly monitor and tune your indexes:

As your data and query patterns change over time, your indexing strategy may need to be adjusted. You should regularly monitor your database engine's performance and tune your indexes as necessary.

Consider using partitioning:

Partitioning can help to improve the performance of large tables by allowing SQL Server to access only the partitions that are needed to satisfy a query.

In summary, creating an effective indexing strategy requires careful consideration of your query patterns and the index key columns that are frequently used in those queries. By choosing the right type of index and avoiding over-indexing and under-indexing, you can improve query performance and optimize the overall performance of your database.
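As an illustration of the covering index guideline, the sketch below builds an index that contains every column one query needs. The table dbo.Orders and the columns CustomerId, OrderDate, and TotalAmount are hypothetical.

```sql
-- Hypothetical table and columns.
-- Key columns support the filter and the sort; INCLUDE adds the column that is only selected.
CREATE NONCLUSTERED INDEX idx_Orders_Customer_Covering
ON dbo.Orders (CustomerId, OrderDate)
INCLUDE (TotalAmount);

-- Every column this query touches is in the index, so no lookup into the table is needed.
SELECT OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerId = 42
ORDER BY OrderDate;
```

Columns that are only returned, not filtered or sorted on, generally belong in INCLUDE rather than in the key, which keeps the upper levels of the index smaller.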

How Can I Tell If SQL Server is Using My Index

[Image: an execution plan in which the query engine reports table scans, showing that SQL Server is not using indexes for the query]

To tell if SQL Server is using your index, you can use SQL Server Management Studio (SSMS) to examine the query's execution plan, or run a query against the index usage statistics. Here are the steps to follow:

Open SQL Server Management Studio and connect to the database server.

Open a new query window and enter the query that you want to examine.

In the query window, click on the "Query" menu and select "Include Actual Execution Plan".

This will enable the actual execution plan to be displayed alongside the query results.

Execute the query by clicking the "Execute" button or pressing F5.

Once the query has executed, switch to the "Execution plan" tab in the Results pane to see the execution plan.

Look for the "Clustered Index Seek" or "Index Seek" operators in the execution plan. If you see one of these operators, it means that the query is using an index to go directly to the matching rows. If you see a "Clustered Index Scan" or "Index Scan" operator instead, it means that SQL Server is reading every row of the index (which, for a clustered index, means the whole table) rather than seeking to the matching rows. A "Table Scan" operator means that the table has no clustered index and no index was used at all.

Alternatively, you can use the DMVs (Dynamic Management Views) in SQL Server to check index usage.

Here is an example query:

SELECT OBJECT_NAME(i.object_id) AS TableName, i.name AS IndexName,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE s.database_id = DB_ID()  -- the DMV is server-wide, so limit it to the current database
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND OBJECT_NAME(i.object_id) = 'YourTableName'
  AND i.name = 'YourIndexName';

This query will show you the number of times the index was used for seeks, scans, and lookups, as well as the number of updates, since the counters were last reset (for example, by a SQL Server restart). If the user_seeks value is greater than 0, it means that the index has been used to seek data. If no row is returned, the index has not been used since then.