SQLSERVER - Index

1.Introduction to Indexes
Creation of well-chosen indexes can greatly improve database performance.

2.Noclustered Index and Clustered Index
*Nonclustered Index
We know now that a nonclustered index has the following features. A nonclustered index:
☆Is a sorted set of entries.
☆Has one entry per row of the underlying table.
☆Contains an Index Key and a Bookmark.
☆Is created by you.
☆Is Maintained by SQL Server.
☆Is used by SQL Server to minimize the effort required to satisfy a client request.
*Clustered Index
A clustered index is a sorted table whose sequence is specified by you when the index is created, and maintained by SQL Server. Any row in that table is quickly accessible given its key value. Any set of rows, in index key sequence, are also quickly accessible given the range of their keys.

There can be only one clustered index per table. The decision of which columns should be the clustered index key columns is the most important indexing decision that you will make for any table.

3.Pages and Extents
SQL Server does not read rows; it reads data in units of one page or more.  The page, which is the smallest unit of IO, is 8K in size.  An extent is 8 consecutive pages in a data file.  Normally an extent, and therefore its pages, contains rows or entries of a single object; be it a heap or an index.  Because of the efficiency provided by large units of IO, a query must be highly selective to benefit from a nonclustered index.

4.Included Columns
Included columns enable nonclustered indexes to become covering indexes for a variety of queries, improving the performance of those queries; sometimes quite dramatically. Included columns increase the size of an index, but add little else in terms of overhead. Any time you are creating a nonclustered index, especially on a foreign key column,
ask yourself – “What additional columns should I include in this index?”

5.Bookmark
A nonclustered index entry consists of search key columns, included columns, and the bookmark. The bookmark value will be either a RID or the clustered index's key, depending upon whether the table is a heap or a clustered index. Choosing the best clustered index for a table requires that you follow three guidelines to ensure that the index key will make a good bookmark.

6.Filtered Indexes
A filtered index eliminates unusable entries from the index, producing an index that is as beneficial as, but much smaller than, an unfiltered index. An index is filtered by specifying a WHERE clause within the CREATE INDEX statement. The columns specified in the WHERE clause can be different from the columns specified for the index key, or from the columns specified in the INCLUDE clause.

If a certain subset of a table's data is frequently requested, a filtered index can also be a covering index; resulting in a considerable reduction in IO.

Do not create filtered indexes as an alternative to properly modeling entity subtypes in your database design.

7.Unique Indexes
Unique indexes provide support for primary key and alternate key constraints. A unique index may exist with a corresponding constraint, but a constraint cannot exist without its index,

A unique index can also be a filtered index. This allows for enforcement of uniqueness of values in columns that permit multiple NULLs within the column.

The IGNORE_DUP_KEY option influences the behavior of multi-row insert statements.

A unique index can provide for better query performance, even if the index is not used by the query.

8.Reading Query Plans
A query plan shows you the methodology SQL Server intends to use, or has used, to execute a query. It does so by detailing the operations that will be used, the flow of the rows from operation to operation, and the parallelism involved.
☆You view this information as a text, graphical, or XML display.
☆Graphical plans show the relative work load of each operation.
☆Graphical plans may suggest an index that will improve the performance of the query.
☆Understanding query plans will help you evaluate and optimize your index design.

9.Index Internal Structure
The structure of index enables SQL Server to quickly access any entry for a specific index key value. Once that entry has been found, SQL Server can:
☆Access the row for that entry.
☆Traverse the index from that point in either ascending or descending sequence.
This indexing tree structure has been in use for a long time, longer even than relational databases, and it has proven itself over time.

10.Index Fragmentation 
Fragmentation can be either internal or external.
Internal fragmentation refers to empty space left within the index pages.
External fragmentation refers to the deviation of the logical sequence (index key order) of an index’s pages from the physical storage sequence.
External fragmentation can slow the performance of index scans. It has no impact retrieving a small number of rows based on index key.
Internal fragmentation increases the size of the index.
Page splitting is extra overhead and causes external fragmentation. Unnecessary page splitting should be avoided. The best way to avoid page splitting and external fragmentation is to periodically rebuild / reorganize the index to produce an index that will have the best page fullness, the least amount of external fragmentation, and smallest amount of page splitting for the foreseeable future.

11.Create Alter Drop
CREATE INDEX enables you to create an index and specify its options.
ALTER INDEX is used to create, disable, rebuild, reorganize, and delete an index; and change its option settings.
ALTER INDEX cannot be used to add or remove columns from an index; CREATE INDEX with the DROP_EXISTING must be used to accomplish this change.
Reorganizing an index requires less time, resources, and exclusive access than rebuilding it; but is less effective at reducing fragmentation.
Disabling a nonclustered index frees its extents and makes the index invisible to SQL queries. Disabling a clustered index frees its non-leaf levels and makes the table inaccessible. A disabled index can only be rebuilt or deleted. Rebuilding a deleted index requires less disk space than rebuilding an index that has not been deleted.
Many option values can be reset only by doing a rebuild.
Creating or dropping the table’s clustered index causes all its nonclustered indexes to be rebuilt.
When a table has multiple indexes, create the clustered index first, then the nonclustered indexes. When dropping those same indexes, drop the clustered index last.
Dropping a clustered index, does not drop the table. The table reverts to being a heap, and the nonclustered indexes are rebuilt.

12.Insert Update Delete
Inserting entries into an index will result in one of three fragmentation patterns, depending upon the sequence of the inserted entries.

Deleting entries from an index, including a clustered index, might not remove the entries immediately. Rather, it may create ghost records by marking the index entries as logically deleted. Ghosting only occurs at the leaf level. SQL Server will remove ghosted entries at a later time, but not until their transaction has completed.

Updating of index entries might be done as an in-place UPDATE or as a DELETE followed by an INSERT. If there is no DML trigger on the underlying table and if the update does not relocate nor increase the size of the entry, an in-place update usually occurs.

If the data modification statement will affect a large number of rows, SQL Server may choose to do an index-at-a-time UPDATE; modifying the table first and then applying the changes to each index separately and in index key sequence.

13.Index Statistics
Index statistics help SQL Server optimize queries.
Index statistics are generated whenever an index is created or rebuilt.
Index statistics are divided into three parts, and can be viewed by executing:
  DBCC SHOW_STATISTICS
Index statistics can become out-of-date. If they do, they can result in poor decisions by the query optimizer.
By default, index statistics are regenerated automatically. You can turn off this automatic regeneration. You can also force a manual regeneration of statistics for an index or for a table.
Regenerating index statistics is normally done by sampling the index, rather than by reading the entire leaf level of the index. You can control the size of the sample that is used to regenerate an index’s statistics.

14.Indexing Best Practices
*Do Not Over Index
Too many indexes are as bad a problem as too few indexes. There is no magic ‘best number of indexes’ for a table. Every table is different. However, once you have indexed the primary key, any candidate keys, and the appropriate foreign keys; any other potential index requires careful analysis on your part before you add it to your database.
*Do Have a Primary Key on Every Table
Although primary keys are not required by SQL Server, a table without a primary key is dangerous thing in a transactional or reporting database, as its rows are not guaranteed to be unique. If duplicate rows are allowed, they will happen; and you will never know whether the same instance of a subject was entered twice, or whether you have separate instances with insufficient information to distinguish one from the other.

Although not required by SQL Server, primary keys are a cornerstone of relation theory, the basic building block of all relational systems. Without primary key constraints, and their associated UNIQUE indexes, relational operations will yield unexpected results and poor performance.

In addition, many client side development tools and components need your tables to have primary keys. For example, both the ADO.Net SqlCommandBuilder component and Visual Studio’s Entity Data Modeler depend upon targeted database tables having primary key constraints. Remember that the name of the primary key constraint becomes the name of the index that is automatically created to enforce that constraint.
*Consider Having a Clustered Index on Every Table
This Stairway’s Level 3 – Clustered Indexes covered the benefits of having a clustered index on a table; that is, of having the table be a clustered index rather than be a heap. The main benefit is the simple fact that the user community as a whole tends to view a table’s data in a certain default sequence; hence the advantage of maintaining the rows in that sequence.

If you follow these recommendations laid out in this Level, every table will have a primary key. Therefore, every table will have at least one index, and probably more. Thus, making one of those indexes the clustered index will not increase the number of indexes, but it will just give your table a better structure than that of a heap.

When deciding on the clustered index key, remember the guidelines specified in Level 6 – Bookmarks: a clustered index key should be unique, short, and non-volatile.
*Avoid Nonclustered, Unfiltered Indexes on Columns that have few Distinct Values
The old cliché is “Never index the Gender column”. A typical page of the table will have half female rows and half male rows, and will be accessed whether the request is for female rows or for male rows. A table scan will always be the best decision for any WHERE GENDER = … query; therefore, such an index will never be of benefit to the optimizer.
*Consider Creating a Filtered Index for Columns that Have a Dominate Value
If a large percentage of the rows have the same value for a particular column, or are null, create a filtered index that on that column. Those queries searching for rows of the rarer values will use this small, efficient index; queries looking for the rows of the common value will do a table scan. And SQL Server can easily determine which is which.
*Do Create a Table’s Clustered Index Before Creating its Nonclustered Indexes
A corollary to this guideline is: Do drop a table’s nonclustered indexes before dropping its clustered index. Doing otherwise will cause the nonclustered indexes to be rebuilt unnecessarily. Transitioning a table between being a heap and being a clustered index always causes the table’s non-clustered indexes to be rebuilt because the contents of the bookmarks must change from row identifiers to clustered index keys.
*Do Plan Your Index Defragmenting and Rebuilding Based Upon Usage

If an indexed is frequently scanned, then, as mentioned in Level 11 – Index Fragmentation, the external fragmentation of that index is important; for it has a major impact on the effort required to scan all or some of the leaf level of an index. If this is the case, consider reorganizing the index when external fragmentation reaches ten percent, and consider rebuilding the index when external fragmentation reaches thirty percent. For most transactional environments, the values mentioned above represent the point at which the benefit of performing the reorganization or rebuild of the index outweighs the cost of doing it.

However, if an index is being searched for a specific key value, then external fragmentation has little or no impact on performance. The IO required to traverse one page at each level from the root page to the leaf level, will be the same regardless of external fragmentation. In this situation, reorganizing or rebuilding the index will have little impact on performance.


  
引自:http://www.sqlservercentral.com/stairway/72399/

你可能感兴趣的:(sqlserver)