Microsoft Premier Support (PFE) Latin America

Este Blog está dedicado a todo aquel interesado en tecnología Microsoft, y con deseos de aprender de la experiencia y vivencias de los PFES de Latinoamerica y del grupo de Incubation Support & Services (ISS)

Storage Area Network (SAN) for DBA's

Storage Area Network (SAN) for DBA's

  • Comments 5
  • Likes

During a chat with friends about getting a better performance of SQL Server, the following question emerge... which other optimization apply to SQL server, apart from Best Practices at the Operating System and Application level? Optimizations at the hardware level are also necessary, especially at the Storage level.

With this, we begin to explain the issues of SAN but focusing it to Database Administrators.

There are many factors that affect the proper performance of I / O disks, factors such as shared or dedicated disks, RAID levels, bus speed, HBA configurations; an important point that is often overlooked is alignment in the disk partitions.

Some of the non-disk factors which can present bottlenecks are:

-          HBA (Host Bus Adapters) or network interface cards

-          Processing capacity of FC (Fibre Channel) or iSCSI ports

-          Switch

-          Number of paths between the server and storage.

It is important to consider and understand the limits of the components involved in the architecture, some of the limits that are established in the market are:

Component

Limits

 

SAS cable

Theoretical: 1.5 gigabytes per second (GB/s)

Typical: 1.2 GB/s

PCI-X v1 bus

X4 slot: 750 MB/s

X8 slot: 1.5 GB/s

X16 slot: roughly 3 GB/s

PCI-X v2 bus

X4 slot: 1.5 – 1.8 GB/s

X8 slot: 3 GB/s

Note: Be aware that a PCI-X bus can be v2 compliant but still run at v1 speeds.

Fibre Channel HBAs, switch ports and front end Fibre Channel array ports

4 gigabits per second (Gbps): 360-400 MB/s

8 Gbps: Double the speed of 4 Gbps

Note: Make sure to consider the limits of the PCI bus. An 8-Gbps card requires a PCI-X4 v2 slot or faster.

But how all this is linked with SQL Server? The fundamental unit of data storage in SQL Server is the page (8KB), the space allocated to a data file (mdf or ndf) of a database is logically divided into pages numbered continuously from 0 to n. The operation of I / O disk is provided on a page.

The hard drives are composed of thin plates, circular disks on the surface of which are electronic media that store information. Each side of each plate has thousands of tracks, the set of tracks with the same diameter of the entire surface of the plate form a cylinder (For new devices the cylinder concept is not relevant, since they are not arranged in concentric circles, however it is helpful to understand the origin of terms). Each disk surface has a head dedicated I / O. The tracks are divided into sectors. A sector is the minimum fragment of data that can be read from or write to a hard disk. The new units can offer 1KB sectors, 3KB or 4 KB.

With RAID technology, data is striped across an array of physical disks. This data distribution scheme complements the way that the operating system requests data. The granularity at which data is stored on one disk of the array before subsequent data is stored on the next disk of the array is called the stripe-unit size. The collection of stripe units, from the first disk of the array to the last disk of the array, is called a stripe

But with all this that is the alignment of the partitions? That the effect on the performance of SQL Server? The diagram explains it... The vertical black dotted lines overlaying a red gradient correspond to the boundary between separate physical disks of a RAID group. Versions of Windows preceding Windows Server 2008 displayed the 63 reserved sectors reported by the disk hardware, immediately after which the remainder of the partition was exposed to the user. In these figures, the common sector size of 512 bytes is used, the stripe unit size in the example is 64 KB, And the default 4 KB NTFS cluster size, appropriate for, say, a file server. In the absence of disk alignment forces the eighth 4-KB cluster of user data to be laid across two stripe units, starting with the last empty 512-byte sector of the first stripe unit and continuing on to the second stripe unit. In this default configuration, every eighth cluster is written across two stripe units. This situation is perpetuated throughout the rest of the partition; because every nth cluster crosses stripe unit boundaries, two I/Os are required to perform a read or write.

The file allocation unit size (cluster size) recommended for SQL Server is 64 KB; the preferred 64 KB cluster size for SQL Server combined with the default partition misalignment forces the user data to be laid across two stripe units, starting with the first available sectors of the first stripe unit and continuing on to the second stripe unit. This situation is perpetuated throughout the rest of the partition. In this configuration, every subsequent cluster is written across two tracks or two stripe units—each and every read and write is affected.

The following is an experiment in which we examined the performance of disk, the test was Windows Server 2003 and SQL Server 2005. Executed a query to extract information from SQL Server, for each run DBCC DROPCLEANBUFFERS applied to clean the SQL Server buffer cache with this get satisfactory evidence. Performance counters that were monitored were counter Avg. Disk Transfer/sec objects LogicalDisk and PhysicalDisk objects which are used to measure the latency of the disk. The results show significant improvement compared discs alignment; metrics show improvement greater than 30% between latency and duration.

For review of correct alignment of disks, you should be clear about several concepts:

Starting Partition Offset: is the starting offset in the partition of the volume to ensure the default mapping n data pages in hidden sectors in the early volumes.

Stripe Unit Size: The granularity at which data is stored on one disk of the array before subsequent data is stored on the next disk of the array. This value is provided by your SAN administrator

File Allocation Unit Size: Also known as cluster size, represents the smallest amount of disk space which can be allocated to hold a file and is determined when the partition is formatted by the operating system.

There are two rules, which hold for correct alignment of disks. The results of the following calculations should be an integer value.

Partition_Offset / Stripe_Unit_Size

Stripe_Unit_Size / File_Allocation_Unit_size

Of the two previous rules, the first is the most important for optimal performance.

Examples of Scenarios with misalignment on disks

Given a partition with Partition Starting Offset 32,256 bytes (31.5 Mb) and a Stripe Unit Size size of 65,536 bytes (64 Kb), the result for Partition_Offset ÷  Stripe_Unit_Size is 0.4921875, which is not an integer, so considering that the disks are not aligned.

Example of Scenario with alignment on disks

Given a partition with Partition Starting Offset of 1048576 bytes (1Kb) and with a size of Stripe Unit Size of 65,536 bytes (64 Kb), the result is exactly Partition_Offset÷ Stripe_Unit_Size 8, an exact integer, so we can say that alignment is correct.

After the above is explained, the question we need answered is ... how can I see the current status of the disks?

To review the Starting Offset on a basic disk, we can use the command WMIC (Windows Management Instrumentation Command-line) to obtain data on the disk, the syntax is:

wmic partition get BlockSize, BootPartition, DeviceID, DiskIndex, HiddenSectors, NumberOfBlocks,  StartingOffset, Name, Index

The value to check with the previous command is StartingOffset, in this example is aligned to 1024 kb.

You can also get the Starting Offset via DISKPART, selecting the drive to analyze and execute the list partition command, which will get the offset.

To review the Starting Offset of a dynamic disk you will need to use the dmdiag utility with the –v switch

If you want to review the File_Allocation_Unit_Size can run the command

fsutil fsinfo ntfsinfo [drive]

The following example shows the execution of all commands above written.

Summary

Many factors contribute to an optimal disk performance. For disk partitions created with Windows Server 2003, is crucial to validate the correct correlation between the Stripe Unit Size and File Allocation Unit Size as best practice. Windows Server 2008 creates partitions aligned by default. When the servers are updated Windows Server 2003 to Windows Server 2008, existing partitions are not aligned automatically, you should rebuild them for optimal performance.

 

 

Comments
  • Excelent Article !!! Just what I was looking for. Thanks a lot.

    I have just one question, I understand that the stripe unit size is generally set when the LUN is configurated, but I would like to know the right way to calculate this and what values should I know to doing this.

    Our SAN is a CX4-120 but I can find any info about this parameter.

    Thanks again,

    Rafael Donado

    MCITP Database Administrator 2008

    Carvajal Información

    Bogotá, Colombia

  • For the stripe size there is a direct correlation with the size of the IO request. If the IO request is bigger than the stripe size it will have to seek across another disk to satisfy the data request. If the IO request size is very small and random you may loose some IO performance if the requests pile up on one disk causing a hot spot. On SQL Server the general recommendation is for 2000/2005 a 64k or 128k stripe size and for SQL Server 2008 a 256k stripe size. Another good tool is the Disk Drive RAID Configuration Tool it could help you some how to do estimations for your RAID Volumes. Another good toll is SQLIOSim, you can use it after configuring your volumes, it can help you to have an idea of the performance of the disk for the different types of access that SQL does.  

    technet.microsoft.com/.../Cc966414

  • interesting read.

  • Glad you like it!

  • "In the absence of disk alignment forces the eighth 4-KB cluster of user data to be laid across two stripe units" WHY?

      128-63=65sectors=32.5kb, 4kb*8=32kb.so i think the 9th cluster will lay across two stripe units.

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment