What Are Indexes In SQL Server

Jan 13, 2023

In SQL Server, an index is a data structure that improves the performance of queries by allowing the database engine to quickly locate and retrieve the requested data. An index is similar to an index in a book, it allows you to quickly find a specific page based on a specific keyword or value. Indexes in SQL Server can be created on one or more columns of a table or view, and they can be of different types, such as:

Clustered index: A clustered index determines the physical order of data in a table. A table can have only one clustered index, because the data rows themselves can be stored in only one order.
A non-clustered index does not affect the physical order of the data rows, but it contains a copy of the indexed columns and a pointer to the actual data row. A table can have multiple non-clustered indexes.
Unique index: A unique index ensures that no two rows of a table have duplicate values in the indexed column(s).
Full-Text index: A full-text index is used to improve the performance of full-text searches on large text columns.
Columnstore index: A columnstore index is a type of non-clustered index that is optimized for data warehousing scenarios and it's designed to work with very large data sets.
XML index: An XML index allows you to create a non-clustered index on an XML data type column in a table.
Spatial index: A spatial index is a type of index that is used to improve the performance of spatial queries on data that is stored in a geometry or geography data type column.

TLDR - Indexes OR Detailed Info :)

A clustered index in SQL Server is a type of index that determines the physical order of data in a table. In other words, it determines how the data rows are stored on disk. A table can have only one clustered index, because the data rows themselves can be stored in only one order. When a table is created, the primary key is automatically used as the clustered index unless another column or set of columns is specified as the clustered index. If a clustered index is not defined, then a unique, non-clustered index is created by default and is known as a heap.

Clustered Index

A clustered index is created on a column or set of columns that have a high degree of uniqueness and are often used in queries to retrieve data. The column or set of columns used for the clustered index are known as the key columns. The values in the key columns are used to order the data rows in the table. The clustered index also includes a non-clustered index, which contains a copy of the key columns and a pointer to the actual data row, this allows the database engine to quickly locate and retrieve the requested data. Queries that use the key columns of the clustered index to filter or sort the data will generally perform better than those that don't. However, having too many clustered indexes on a table can cause performance issues because of the extra overhead of maintaining them. It's important to consider the usage of the table, the number of inserts, updates, and deletes and the selectivity of the column before creating a clustered index.

Non-clustered indexes

Non-clustered indexes are generally used to improve the performance of queries that filter or sort the data based on columns that are not included in the clustered index. They are also used to enforce unique constraints on columns that are not the primary key. Non-clustered indexes can also be filtered, this means that the index only includes a subset of the rows in the table, allowing the database engine to quickly locate and retrieve the requested data from a smaller set of rows. Like clustered indexes, non-clustered indexes also have their own trade-offs, creating too many non-clustered indexes can also cause performance issues due to the extra overhead of maintaining them. It's important to consider the usage of the table, the number of inserts, updates, and deletes and the selectivity of the column before creating a non-clustered index.

Unique Index

A unique index in SQL Server is a type of index that ensures that no two rows of a table have duplicate values in the indexed column(s). It is similar to a non-clustered index, but it also enforces the constraint of unique values in the indexed columns. When a unique index is created, the database engine checks that there are no existing rows with duplicate values in the indexed column(s) and it will prevent new rows with duplicate values from being inserted. If a new row is inserted that would cause a duplicate value, the insert will fail and an error will be returned. A unique index can be created on one or more columns of a table. The columns that are included in a unique index are known as the key columns. The values in the key columns are used to ensure the uniqueness of the data. Unique indexes are commonly used to enforce unique constraints on columns that are not the primary key, such as a unique identifier or an email address. They can also be used to improve the performance of queries that filter or sort the data based on unique values in the indexed columns. It's important to note that a unique index can be created on a nullable column, however, it will allow only one NULL value. Like other types of indexes, creating too many unique indexes can cause performance issues due to the extra overhead of maintaining them. It's important to consider the usage of the table, the number of inserts, updates, and deletes and the selectivity of the column before creating a unique index.

Spatial Index

A spatial index in SQL Server is a type of index that is used to improve the performance of spatial queries on data that is stored in a geometry or geography data type column. Spatial data is data that represents the position and shape of objects in two-dimensional space, such as points, lines, and polygons. Spatial indexes are designed to work with very large sets of spatial data, allowing the database engine to quickly locate and retrieve the requested data. When a spatial index is created, it is built using a grid-based data structure called a spatial indexing grid. The grid divides the spatial data into a set of smaller, regularly-shaped cells, which makes it possible to quickly locate and retrieve the requested data. Spatial indexes are used in geographic information systems (GIS) and other applications that involve working with spatial data, such as location-based services, transportation systems, and environmental monitoring. They can be used to perform operations such as spatial joins, spatial filtering, and spatial analysis on large sets of spatial data. Spatial indexes are also used to enforce spatial constraints, such as ensuring that a point is within a specific area or that two lines do not intersect. Spatial indexes can greatly improve the performance of spatial queries, but they also add overhead to data modification operations, such as INSERT, UPDATE, and DELETE. It's important to consider the usage of the spatial data, the number of inserts, updates, and deletes and the size of the data before creating a spatial index.

Full-Text Index

A full-text index in SQL Server is a type of index that is used to improve the performance of full-text searches on large text columns. Full-text search allows you to search for specific words or phrases in large amounts of text data, such as documents, emails, or articles. When a full-text index is created, the database engine uses a process called full-text indexing to create a separate index of the text data. This index contains a list of the words and phrases that appear in the text data, along with their frequency and location. Full-text indexing is performed on the text data column and it can be done on one or more columns. The indexed columns are called Full-Text indexed column. A full-text search query can then be used to search the index for specific words or phrases, allowing the database engine to quickly locate and retrieve the requested data. The full-text search query can include wildcard characters and Boolean operators, such as "AND" and "OR", to further refine the search. Full-text indexes can be used to improve the performance of searches on large text columns, such as those that contain documents, emails, or articles. They can also be used to improve the performance of searches on columns that contain large amounts of unstructured data, such as product descriptions, resumes, or customer feedback. It's important to note that full-text indexing requires additional disk space, and also it's not available on all editions of SQL Server. Full-text indexes are not recommended for small text columns or for columns that are frequently updated.

Columnstore Index

A columnstore index in SQL Server is a type of non-clustered index that is optimized for data warehousing scenarios and it's designed to work with very large data sets. It uses a column-based storage approach, which is different from the traditional row-based storage approach used by other types of indexes. When a columnstore index is created, it stores the data in columns rather than rows. This allows the database engine to compress the data more effectively and to retrieve only the specific columns that are needed for a query, rather than the entire row. Columnstore indexes are best suited for large data warehousing scenarios, where large amounts of data need to be analyzed and where queries typically retrieve a small subset of the data. They can also be used for large reporting scenarios, where the queries are typically read-only and where the performance of the queries is more important than the performance of data modification operations. Columnstore indexes can greatly improve the performance of data warehousing queries, but they also add overhead to data modification operations, such as INSERT, UPDATE, and DELETE. They are not recommended for small tables or for tables that are frequently updated. It's important to note that columnstore indexes are not available on all editions of SQL Server, and they also require additional disk space.