When implementing, replatforming, upgrading, or simply making technology stack changes to an LOB application like SAP, PeopleSoft, Siebel, EBS, or Dynamics, special attention must be paid to configuring and optimizing the disk subsystem hosting the system’s data. After all, experience tells us that this is a common culprit in end-user performance problem scenarios. To forgo tuning and validating the disk subsystem post-change is to risk serious performance and availability issues. Yet many of the methods and tools commonly used for baselining disk performance are often tied to the application package being deployed or to the underlying RDBMS. Because of their complexity and the investment needed in terms of time, these methods often takes a back seat to other matters. What we REALLY need is a simple way to validate that a SAN or other disk subsystem can push a certain I/O load or support a particular throughput level. For example, if I know my newly deployed or updated SAN can move data at 1.2 GB/second, or process 10,000 64kb reads every second, I’d be really comfortable using that SAN to support an SAP ERP system that’s only expected to process 4,000 reads a second. So…..let’s talk about a easy-to-use tool and method of testing and validating disk subsystem performance that comes into play before the database and application are ever even introduced into the equation. The tool – Microsoft’s SQLIO.
What is SQLIO?
SQLIO is a tool provided for many years from Microsoft, developed long ago to help the SQL Server technical team characterize different computing platforms. As such, it’s described as “a disk workload generator that is designed to simulate some aspects of the I/O workload of Microsoft SQL Server.” The key here is “some,” of course, as simulating a true production database is much more complex than executing a command-line utility with a couple of switches enabled. But, as many of my very own enterprise customers know, Microsoft’s SQLIO utility is a worthwhile tool for a number of reasons:
The “older” standalone version of SQLIO was not generally available to the public; Microsoft’s Hardware and Services partners were authorized to use the tool provided it was not left in the hands of the customer. Another tool – SQLIOStress – was generally available to anyone through public download – but this tool was a bit more complex and focused more specifically on emulating MS SQL Server. HOWEVER, in 2005 a publicly available version of SQLIO (offered as an msi file) was introduced. You may download it from http://www.microsoft.com/downloads/en/details.aspx?familyid=9a8b005b-84e4-4f24-8d65-cb53442d9e19&displaylang=en. Keep in mind that while the remainder of this blog posting focuses on the older standalone version of SQLIO, the methods of using the tools are similar.
How to Baseline a Disk Subsystem Using SQLIO
Once the disk subsystem has been setup, changed, or provisioned online (in the case of Amazon S3, Azure, etc), the next step is to begin setting up the specific disk groups and LUNs as directed by the LOB or vendor Storage Architect. After the LUNs are carved up, they are typically presented (made accessible) to the LOB database server and other servers that will ultimately use the disk subsystem; afterwards, the database and application installations can begin. However, before the RDBMS and applications are installed, I recommend using SQLIO to baseline your disk subsystem:
The fact that SQLIO allows you to define the time your particular test run will execute makes it a much better alternative to its hard-coded counterparts that require an hour to run. SQLIO also allows you to control queue depth per thread; thus, you can easily execute a “smoke test” where a disk subsystem is truly saturated, or emulate a more realistic “average load test” where the disk queue might be 1, 2, 4, 8, etc (whatever is deemed appropriate) in front of each LUN.
For simple baseline testing or even “delta” testing between two disk subsystem configurations, I like to execute each of eight discrete SQLIO tests for 60 seconds at a time. The SQLIO configuration file is edited to run through sequential read, random read, sequential write, and random write 8kb blocksize operations (4 different tests). I then repeat the same set of tests for 64kb block sizes, for 8 tests total. The fact that only eight minutes of wall-clock time is consumed for these eight minutes of testing makes SQLIO a compelling tool set indeed (it’s important to remember that the first time the tests are executed, though, the “data” files against which the tests will run will actually be created, making the first test longer in duration than subsequent tests).
Analyzing SQLIO’s output is easy. Since the SQLIO test-run output is space-delimited, dump one or multiple test runs into an Excel spreadsheet for subsequent analysis. I typically set up different columns for specific disk configurations (i.e. iterations in tweaking the configuration’s disk controller setup, group configuration, number of spindles per group or LUN, and so on), OS configuration changes (i.e. changes to block sizes or the number of logical drives). I then dump the test run output underneath each vertical column, so as I scan horizontally across my spreadsheet I can observe changes in MB/second throughput and total I/Os processed per second. It’s a simple way to help reinforce that my changes to a particular disk subsystem or OS configuration are indeed resulting in adequate (hopefully improved!) disk throughput and performance.
At the end of the day, with my SQLIO output in hand it’s easy for me to prove to a customer that their particular disk subsystem is ready for prime-time. Sure, the actual database configuration or application itself might be horribly tuned (another discussion altogether!). When performance problems crop up and the customer is tempted to start finger-pointing, I’ll be able to keep them focused away from the disk and OS….and in cases where an LOB application is already live, SQLIO gives me the ability to determine if the disk system is configured well relative to the load it is supporting. Either way, the comparative value – through iterative tuning and testing, or by comparing my results to those of previous SQLIO baseline engagements – saves all of us a lot of time, and helps us focus our collective tuning efforts on the technology layers where that incremental effort may actually provide real value.
I've got some "Using SQLIO tips and tricks" that might be useful to some of our readers - let me know if that's interesting to you, and I'll get those posted as well.
Enjoy the long weekend!