Business Intelligence Documents: July 2013

Thursday, July 25, 2013

Covering index in sql server

Covering indexes are one of the best feature in nonclustered indexes. Before covering indexes were introduced if you want multiple column search in some queries then you have to create nonclustered index on multiple column. And now specify you first column for nonclustered key value and rest column list goes with include clause.

Covering indexes are nonclustered index with include clause. Because creating wide or composite nonclustered index is lack of performance compare to covering index because it requires more storage space at top level and intermediate level of index b-tree.

CREATE NONCLUSTERED INDEX <index-name> ON <table-name> (<column-name>)
INCLUDE (<Columns>)

Above covering index is much faster in terms of execution compare to

CREATE NONCLUSTERED INDEX <index-name> ON <table-name> (<columns>)

We can take example:

Create table AB
(
id int,
name nvarchar(255)
)

create nonclustered index IX_AB on AB(id,name)

select * from AB where id = 1 and name = 'AA'

create nonclustered index CIX_AB on AB(id) include(name)

select * from AB where id = 1 and name = 'AA'

Indexes in sql server

When performance matters indexes comes first in terms of query performance. Poorly designed indexes and a lack of indexes are primary sources of database application bottlenecks. Designing efficient indexes is paramount to achieving good database and application performance.

Large numbers of indexes on a table affect the performance of INSERT, UPDATE, DELETE, and MERGE statements because changes on table also make changes on indexes. If we have more select statements like we have in Data Warehouse than heavily indexes are preferable.

Indexes are one kind of pattern of storing data for faster retrieval. When you create an index on table it basically creates B-Tree (Balanced Tree) structure. It follows format for easier search of data.

When you do not have any index on table than data get stored in random fashion which is also called table on Heap. Data retrieval are much slower when you compare table on a heap and table on an index.

In B-tree structure top level and intermediate level have nodes which stored key values based on what you are creating indexes. On the bottom level which is called leaf nodes may have data or pointers to other indexes key values.

When you search data in B-tree it starts from top to search for values and find the according values.

In indexes we have two major types of indexes:

Clustered index and nonclustered index.

Clustered Index:

When you create a clustered index on a table than original storage structure will be thrown and it creates b-tree structure. First it sorts data on clustering key value. On leaf node of b-tree it actually stores all rows of data.

So, when you have clustered index on table and you are disabling index you are not able to retrieve any data and you cannot perform any DML operations on that table.

Having multiple clustering index doesn't make any sense. For that reason we have only 1 clustered index per table.

Creating clustered index in sql server
CREATE CLUSTERED INDEX <index-name> on <table-name> (<column-name>)

Nonclustered index:

Non clustered index is much different than clustered index. It sorts data based on your clustering key value and logically stores.

If you already have clustered index on table and you are creating nonclustered index on that table than on the leaf node you have nonclusered key value and pointer to clustered key value.

If you are creating nonclustered index on heap table than its called nonclustered index on heap. In leaf node we have nonclustered key value and pointer to heap row which is also called RID (Row Identifier).

Creating nonclustered index in sql server
CREATE NONCLUSTERED INDEX <index-name> ON <table-name>(<column-name>)

We can have clustered and nonclustered indexes on single or multiple column. Creating indexes on multiple column also known as WIDE index or composite key index.

Friday, July 12, 2013

Transactions in T-SQL

Transaction is a logical unit which contains set of statements. Transactions are made in terms of following ACID properties in database.

ACID properties is one of major requirement in database to avoid deadlocks. Atomicity, Consistency, Isolation and Durability.

Atomicity: All T-SQL statements must be executed all together or nothing should be executed inside a transaction.
Consistency: All transactions must leave the data in a consistent state after execution.
Isolation: Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. ie one transaction should not modify the same data while the other transaction is still running on it.
Durability: If a transaction is committed it must be committed on permanent basis. Once committed it persists.

Types of Transactions in T-SQL

Autocommit transactions: Each individual statement is a transaction.
Explicit transactions: Each transaction is explicitly started with the BEGIN TRANSACTION statement and explicitly ended with a COMMIT or ROLLBACK statement.
Implicit transactions:A new transaction is implicitly started when the prior transaction completes, but each transaction is explicitly completed with a COMMIT or ROLLBACK statement.
Batch-scoped transactions:Applicable only to multiple active result sets (MARS), a Transact-SQL explicit or implicit transaction that starts under a MARS session becomes a batch-scoped transaction. A batch-scoped transaction that is not committed or rolled back when a batch completes is automatically rolled back by SQL Server.

Actions performed on Explicit Transactions

BEGIN TRANSACTION: Marks the starting point of an explicit transaction for a connection.
COMMIT TRANSACTION: Used to end a transaction successfully if no errors where encountered. All DML changes made in the transaction become permanent.
ROLLBACK TRANSACTION: Used to erase a transaction in which errors are encountered.All DML changes made in the transaction are undone.