Welcome to TechNet Blogs Sign in | Join | Help

Storage Consideration for SQL Server 2005 DW environment

Storage design is a key component of data warehouse/BI environment. Optimum storage design provides significant performance. SQL Server performance depends heavily on the I/O subsystem. Unless your database fits into physical memory, SQL Server constantly brings database pages in and out of the buffer pool. This generates substantial I/O traffic. Similarly, the log records need to be flushed to the disk before a transaction can be declared committed. And finally, SQL Server uses tempdb for various purposes such as to store intermediate results, to sort, to keep row versions and so on. So a good I/O subsystem is critical to the performance of SQL Server.

Following key points should be considered while designing storage system for consolidated SQL Server environment.

  • RAID 1/0 (or RAID 10) gives excellent random read/write performance. RAID 10 is ideal for OLTP environment with lots of small random read/writes. This RAID type also provides good fault tolerance since it can survive the failure of up to half of the disks, provided one disk in each mirror image pair survives. Use RAID 10 if more than 30 percent of the IO is small random writes and if budget permits. RAID 10 is recommended for OLTP Data, Transaction Log, and Tempdb data volumes.
  • RAID 5 gives excellent read performance, especially large sequential I/O. But it provides lower random write performance. It also delivers flower fault tolerance than RAID 10 since it can tolerate only one drive failure per RAID 5 LUN. Also, in the event of a drive failure, the time for the storage system to rebuild the content of the failed drive is longer than RAID 10. RAID 5 costs less for the same storage capacity compared to RAID 10. RAID 5 is ideal for Backup volumes.
  • SQL Server log files are accessed sequentially and are write intensive. Since RAID 5 requires 4 I/O per write (write data, validate data, write parity, and validate parity), placing the log files on RAID 5 array greatly increases the possibility for an I/O bottleneck. Placing the data file and log files on the same drive spindles creates contention for the drive heads between the sequential movement required for the log writes versus the random drive head movement required to read and write data. Because of this, the log file should not be placed on the same set of spindles as the data files. The log files should be placed on a RAID 1+0 arrays.
  • SQL Server 2005 makes greater use of tempdb than SQL Server 2000. For high performance applications, tempdb should be placed on a dedicated RAID 1+0 array and sized large enough so it does not auto grow. TempDB has a large number of objects being created all the time. For an object to be created, space must be allocated to it. Space allocation is determined by looking at some of the internal system pages (the GAM, and SGAM). In the end, it is these pages that start to have significant contention (with just one file) in a VERY active TempDB. To minimize that contention you should create multiple files. The guideline is to create as many numbers of files as the number of processor cores on the system. So, for example if you have dual core 8 way processor, you should have 16 files for your tempdb database.
  • Backups should be made to drives other than the ones used for data and log not only to separate out I/Os, but physically separate drives means that even if the data or log disks fail, the backup drive will most likely still exist. Writing the backup file to the same set of disks means that the data file is being read from and the backup file is written to all using the same I/O pool. It also puts available space at risk since backups can fill up drives quickly without a proper retention policy to manage them.
  • All versions of SQL Server write in 8k pages and the read ahead is 64k. The recommendation is to format any disks which may be used for SQL Server data using a 64k block size. Using defaults for NTFS (which is not 64K) will hamper SQL Server performance. Even if there is the potential that a disk may be used for SQL Server data, format it with a 64k block size.
  • SQL Server on non-optimized and shared SAN where many different applications and I/O profiles access the same spindles can result in performance issues. Sections of the storage should be reserved for high performance SQL Server requirements and carved up at the time of deployment of the particular SQL Server solution. These storage allocations should not come from the general storage pool.
  • Sector alignment. Sector alignment can increase the I/O performance of SQL Server by up to 20%. If sector alignment is required, this will be done via the command line utility DISKPART in Windows Server 2003 prior to formatting the disks with NTFS.
Posted by vishah | 2 Comments

Data Protection Manager - an ultimate solution for protecting SQL Server data

I learned about new tool from Microsoft System Center family called Data Protection Manager 2007.   Here is some details on how it works and how can it save big bucks when comparing with other backup compression tool in market such as LiteSpeed.

Below is its overall architecture:

DPM Architecture 

How does SQL Server protection with DPM 2007 work?

DPM 2007 uses a combination of transaction log replication and block-level synchronization in conjunction with the SQL VSS Writer to help ensure your ability to recover SQL Server databases. After the initial baseline copy of data, two parallel processes enable continuous data protection with integrity:

  • Transaction logs are continuously synchronized to the DPM 2007 server, as often as every 15 minutes.
  • An “express full” uses the SQL Server VSS Writer to identify which blocks have changed in the entire production database, and send just the updated blocks or fragments. This provides a complete and consistent image of the data files on the DPM server or appliance. DPM 2007 maintains up to 512 shadow copies of the full SQL Server database(s) by storing only the differences between any two images.

Assuming one “express full” per week, stored as one of 512 shadow copy differentials between one week and the next, plus 7 days x 24 hours x 4 (every fifteen minutes), DPM 2007 can provide over 344,000 data consistent recovery points for SQL.

 

How does Restore work?

With only a few mouse clicks and DPM 2007, you can:

  • Restore a SQL database directly back to the original server
    • Databases are recovered back where they came from into the active SQL Server, with no additional work for the SQL administrator to do afterwards. This improves recovery time and reduces the number of people needed during a crisis recovery.
  • Restore to a “recovery database” on the original server
    • This provides an alternative where the active SQL database is untouched, but the older data can be restored onto the same SQL server but in an alternate database – allowing both versions of the data to be accessed independently but concurrently.
  • Copy database files to an alternate server,
    • This option enables disaster recovery, compliance auditing, or software testing – without affecting the production environment
  • Copy the database file to tape,
    • Use this option to create a long-term archive or portable media of the data at any recovery point, even after the fact. For example, if the accounting department closes the quarterly books at 10:45 AM on a Tuesday, you can “recover” the data from that point, directly to tape – without backing up anything else or impacting the production server that is already serving new data.

Comparison of DPM 2007 with other Compressed Backup Utility:

Customer Scenario – Protect 4 Databases totalling 250 GB for 14 days:

  • With good compression efficiency of 70%, like LiteSpeed:
  • 30% additional disk space required on a production server’s local disk, for nightly compressed backups – which are then copied to the backup server at 75GB per night.
  • Total additional disk to protect 250 GB for 14 days with LiteSpeed = 1.12 TB (75 GB on local production server plust 1.05 TB for 2 weeks, 75 GB each day)
  • With DPM2007, you don’t need additional local disk space.
  • On DPM server, DPM replica “Express Full” which is exact replica of databases will consume 250 GB. In addition to that with DPM daily transaction logs at 10% change rate, it will require 25 GB on daily basis. So, to keep two weeks (14 days) of data on DPM server you need 575 GB (250 + (25 * 13)) as compared to 1.12 TB for LiteSpeed.

DPM - comparison1

                

         DPM - comparison2

DPM - comparison3

Posted by vishah | 1 Comments

A Great SQL Resource - SQLCommunity.com

I came across www.SQLCommunity.com - a great resource for SQL Server.   

It has tones of information including Scripts and Tools, Tips, Tricks and Techniques, lots of info on SQL Events, etc.  This site is run by Saleem Hakani - a Microsoft Database Engineer.  I know Saleem very well, he is very passionate about bringing SQL awareness world wide.   He organizes many internal SQL events such as SQLFest, SQLSummit.   I have attended one his event, and I must say, in just one day I got lot out of that event.   I highly recommend to put his site in your SQL tool box.   It could be your one stop shop for all you need from SQL Server.

Posted by vishah | 0 Comments

Finding Missing Indexes and Unused Indexes using DMVs

Finding Missing Indexes:

Dynamic management object

Information returned

sys.dm_db_missing_index_group_stats

Returns summary information about missing index groups, for example, the performance improvements that could be gained by implementing a specific group of missing indexes.

sys.dm_db_missing_index_groups

Returns information about a specific group of missing indexes, such as the group identifier and the identifiers of all missing indexes that are contained in that group.

sys.dm_db_missing_index_details

Returns detailed information about a missing index; for example, it returns the name and identifier of the table where the index is missing, and the columns and column types that should make up the missing index.

sys.dm_db_missing_index_columns

Returns information about the database table columns that are missing an index.

Note:

When the metadata for a table changes, all missing index information about that table is deleted from these dynamic management objects. Table metadata changes can occur when columns are added or dropped from a table, for example, or when an index is created on a column of a table.

When SQL Server is restarted, all of the missing index information is dropped.

Example using DMVs:

Query:

SELECT mig.index_group_handle,mid.index_handle, migs.avg_total_user_cost as AvgTotalUserCostThatCouldbeReduced, migs.avg_user_impact as AvgPercentageBenefit,

'CREATE INDEX missing_index_' + CONVERT (varchar, mig.index_group_handle) + '_' + CONVERT (varchar, mid.index_handle)

  + ' ON ' + mid.statement

  + ' (' + ISNULL (mid.equality_columns,'')

    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END

    + ISNULL (mid.inequality_columns, '')

  + ')'

  + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement

    FROM sys.dm_db_missing_index_groups mig

INNER JOIN sys.dm_db_missing_index_group_stats migs ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details mid ON mig.index_handle = mid.index_handle

--where statement = '[<DBName>].[dbo].[<TableName>]'

 

Finding Unused Indexes:

 

DMVs used: sys.dm_db_index_usage_stats

Example to find Unused Indexes:

select object_name(i.object_id) as ObjectName, i.name as [Unused Index]

from sys.indexes i

left join sys.dm_db_index_usage_stats s on s.object_id = i.object_id

      and i.index_id = s.index_id

      and s.database_id = db_id()

where objectproperty(i.object_id, 'IsIndexable') = 1

AND objectproperty(i.object_id, 'IsIndexed') = 1

and s.index_id is null -- and dm_db_index_usage_stats has no reference to this index

or (s.user_updates > 0 and s.user_seeks = 0 and s.user_scans = 0 and s.user_lookups = 0) -- index is being updated, but not used by seeks/scans/lookups

order by object_name(i.object_id) asc

In the above query, user_seeks, user_scans, and user_lookups counters indicate the usage of the particular index.  If the value of the counters is 0, it means that index has never been used for any Seek, or Scan or Lookup operations. 

Every individual seek, scan, lookup, or update on the specified index by one query execution is counted as a use of that index and increments the corresponding counter in this view (DMV).

The user_updates counter indicates the level of maintenance on the index caused by insert, update, or delete operations on the underlying table or view.

The counters are initialized to empty whenever the SQL Server (MSSQLSERVER) service is started. In addition, whenever a database is detached or is shut down (for example, because AUTO_CLOSE is set to ON), all rows associated with the database are removed.

When an index is used, a row is added to sys.dm_db_index_usage_stats if a row does not already exist for the index. When the row is added, its counters are initially set to zero.

 

Posted by vishah | 0 Comments

DDL Triggers and LOGON Triggers

DDL Triggers: 

DDL triggers can be used for administrative tasks such as auditing and regulating database operations. Use DDL triggers when you want to do the following:

¾  You want to prevent certain changes to your database schema.

¾  You want something to occur in the database in response to a change in your database schema.

¾  You want to record changes or events in the database schema.

DDL triggers fire only after the DDL statements that trigger them are run. DDL triggers cannot be used as INSTEAD OF triggers.

Example of DDL Trigger:

This example tracks the Login Events on a server level.   It uses DDL_LOGIN_EVENTS event group from server scope events from above diagram.    It fires against DDL Login activities such as CREATE, ALTER, DROP Login.   

 

CREATE Trigger [Trg_TrackLoginManagement]

on ALL Server

for DDL_LOGIN_EVENTS

as

 

set nocount on

declare @data xml,

              @EventType varchar(100),

              @EventTime datetime,

              @ServerName varchar(100),

              @AffectedLoginName varchar(100),

              @WhoDidIt varchar(100),

              @EmailSubject varchar(500),

              @EmailBody varchar(800),

              @EmailRecipients varchar(300)

 

set @EmailRecipients = 'DBAS@microsoft.com'

 

set @data = eventdata()

set @EventType = @data.value('(/EVENT_INSTANCE/EventType)[1]', 'varchar(100)')

set @EventTime = @data.value('(/EVENT_INSTANCE/PostTime)[1]','datetime')

set @ServerName = @data.value('(/EVENT_INSTANCE/ServerName)[1]','varchar(100)')

set @AffectedLoginName = @data.value('(/EVENT_INSTANCE/ObjectName)[1]','varchar(100)')

set @WhoDidIt = @data.value('(/EVENT_INSTANCE/LoginName)[1]','varchar(100)')

 

insert into msdb..TrackLoginManagement values (@EventType,@EventTime,@ServerName,@AffectedLoginName,@WhoDidIt)

 

set @EmailSubject = 'ALERT: DDL_LOGIN_Event: ' + @EventType + ' occured by ' + @WhoDidIt + ' on ' + @ServerName

set @EmailBody =  'DDL_Login_Event: ' + @EventType + char(10) +

                     'Event Occured at: ' + convert(Varchar, @EventTime) + char(10) +

                     'ServerName: ' + @ServerName + char(10) +

                     'Affected Login Name:      ' + @AffectedLoginName + char(10) +

                     'Event Done by: ' + @WhoDidIt

 

EXEC msdb.dbo.sp_send_dbmail

    @recipients = @EmailRecipients,

    @body = @EmailBody,

    @subject = @EmailSubject ;

 

print @Eventtype + ' activity completed successfully.'

GO

 

LOGON Triggers:

Similar to DDL Triggers, LOGON Triggers fire stored procedures or T-SQL statements in response to LOGON events.    Logon triggers fire after the authentication phase of logging in finishes, but before the user session is actually established. Therefore, all messages originating inside the trigger that would typically reach the user, such as error messages and messages from the PRINT statement, are diverted to the SQL Server error log. Logon triggers do not fire if authentication fails.

You can use logon triggers to audit and control server sessions, such as by tracking logon activity, restricting logons to SQL Server, or limiting the number of sessions for a specific logon.  

Example of LOGON Trigger:

In the following code, the logon trigger captures information from EventData() functions such as LoginName, HostName, LoginTime, and LoginType when any logon activity happened by a SQL Login which are connecting from hosts other than application servers.  This is useful to monitor people connecting to SQL Server using application ids during peak hours and possibly causing performance issues by running inefficient queries outside of an application.

 

create table myTest

(LogonTime datetime,

LoginName varchar(50),

ClientHost varchar(50),

LoginType varchar(50)

)

 

create TRIGGER myTest_LogonTrigger

ON ALL SERVER WITH EXECUTE AS 'sa'

FOR LOGON

AS

BEGIN

       declare @LogonTriggerData xml,

                     @EventTime datetime,

                     @LoginName varchar(50),

                     @HostName varchar(50),

                     @LoginType varchar(50)

       set @LogonTriggerData = eventdata()

 

      

       set @EventTime = @LogonTriggerData.value('(/EVENT_INSTANCE/PostTime)[1]', 'datetime')

       set @LoginName = @LogonTriggerData.value('(/EVENT_INSTANCE/LoginName)[1]', 'varchar(50)')

       set @HostName = @LogonTriggerData.value('(/EVENT_INSTANCE/ClientHost)[1]', 'varchar(50)')

       set @LoginType = @LogonTriggerData.value('(/EVENT_INSTANCE/LoginType)[1]', 'varchar(50)')

 

       if @LoginType = 'SQL Login' and @HostName not in ('<comma separated list of hostname>')

              and @LoginName not in ('<comma separated list of SQL logins>')

              insert into master..myTest values (@EventTime, @LoginName, @HostName, @LoginType)

end

go

 

Posted by vishah | 0 Comments

Plan Cache Analysis in SQL Server 2005

Plan Cache: The portion of the memory pool allocated to store Query execution Plan

For how much memory is allocated for Plan Cache in different version of SQL server, please refer to my blog: - Understanding Query Plan Cache in SQL 2005 SP2 - why it's changed from SQL 2005 RTM/SP1?

A stored execution plan can be one of the following:

  • Compiled – the actual instructions describing how SQL Server will implement a query
  • Execution Context – run-time information unique to a user:
    • Values of local variables
    • Object ID’s of temporary objects
    • Which statement currently being executed
  • Cursors – execution state of server-side cursors
  • Parse Tree – Aids in parsing the query. Usually not cached.

In SQL Server 2005 there are four major DMVs which can be very useful in Plan Cache Analysis:

1. sys.dm_exec_requests

shows information about all running processes (similar to sysprocesses in SQL 2000) along with sql_handle, and plan_handle which can be utilized as an input in following DMVs

example:

SELECT

    sder.session_id AS [SPID],

       sder.sql_handle as [SQL_Handle],

       sder.plan_handle as [PLAN_Handle],

    sdes.login_name AS [Login],

    sd.name AS [DBName],

    sder.start_time AS [Start Time],

    sder.status AS [Status],

    sder.command AS [Command],

    sdet.text AS [SQL Text],

    sder.percent_complete AS [Pct Cmplt],

    sder.estimated_completion_time AS [Est Cmplt Time],

    sder.wait_type AS [Wait],

    sder.wait_time AS [Wait Time],

    sder.last_wait_type AS [Last Wait],

    sder.cpu_time AS [CPU Time],

    sder.total_elapsed_time AS [Total Elpsd Time],

    sder.reads AS [Reads],

    sder.writes AS [Writes],

    sder.logical_reads AS [Logical Reads]

FROM

    sys.dm_exec_Requests sder

    CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS sdet

    JOIN sys.dm_exec_sessions sdes on sder.session_id = sdes.session_id

    JOIN sys.databases sd on sder.database_id = sd.database_id

WHERE

    sder.session_id <> @@SPID and sder.session_id > 50

 

2. sys.dm_exec_query_plan (plan_handle)

returns the plan in XML format.   You can save the XML format query plan with extension  ".sqlplan" and when you open up that file it will show a nice graphical query plan.  

SELECT * FROM sys.dm_exec_query_plan (<plan_handle>)  --here <plan_handle> is supplied based on the results from Query in Example 1.

3. sys.dm_exec_query_stats 

·         The most important DMV – sys.dm_exec_query_stats - returns aggregate performance statistics for cached query plans. The view contains one row per query statement within the cached plan, and the lifetime of the rows are tied to the plan itself. When a plan is removed from the cache, the corresponding rows are eliminated from this view.   Please note that the statement_start_offset and statement_end_offset columns are zero based, represents number of bytes and the end offset is -1 for last statement in the batch. In addition the text value returned by sys.dm_exec_sql_text is nvarchar(max) instead of text as documented incorrectly in Books Online.  Combined with sys_dm_exec_sql_text, it could provide rich information with all the performance related aggregates - IO (logical reads/writes, physical reads/writes), CPU (worker time), Duration (elapsed time), #of executions (execution_count).   

SELECT db_name(qt.dbid) as DatabaseName, object_name(qt.objectid,qt.dbid) as ObjectName,

Qt.text as 'statement',

 

substring(Qt.text, (qs.statement_start_offset/2)+1

                        , ((case qs.statement_end_offset

                              when -1 then datalength(Qt.text)

                              else qs.statement_end_offset

                           end - qs.statement_start_offset)/2) + 1) as 'Statement',

  QS.Total_Logical_Reads,

  QS.Total_Worker_Time,

  QS.Total_Elapsed_Time,

  QS.Execution_Count

FROM sys.dm_exec_query_stats