HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Estimating IO Requirements for Bursty Usage

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
estimatingusageforburstyrequirements

Problem

We have an application that queries a SQL database periodically throughout the day. There are periods of zero or only light activity, interspersed with individual requests for relatively large amounts of data. When those requests come in, the primary objective is to deliver the data quickly, and the secondary objective is to do that cost-effectively. Due to the nature of the application, it is fairly unlikely that the data/indexes will have been cached in RAM from the previous query (different users, working on different portions of the data).

For a system that experiences relatively steady usage, I have heard the rule of thumb to observe the disk queue length and keep that number relatively small. This will specifically run in AWS, where I have seen the rule of thumb that a disk queue length of 1 per 100 IOPS is reasonable.

How can I estimate the IO requirements for such a system? Is the disk queue length a reliable indicator when dealing with individual, bursty queries? Are there other metrics that I should consider?

Solution

The primary metric I've always considered for IO in SQL Server is not IOPs or Disk Queue Length, but disk throughput (sec/reads and sec/writes). Overall, databases aren't about how many operations you can throw at a disk, but how quickly those operations are completed. The general rule of thumb is to have less than 20ms/operation (though lower is always better). More detail can be found in this article.

Disk Queue Length is a bogus stat and no longer relevant. The problem with it is that the value measures the queue for a single drive, but now that we live in an age of RAIDs, SANs, and other distributed storage, there's no way to properly translate this value to a meaningful number. A great starting place for performance metrics is this poster from Quest/Dell that gives you a lot of stuff and explanations for why or why not they are important. You don't have to use all of them, but they are a start.

In order to test your IO, you have to understand your workload at its peak. How many transactions and how much is cached? Unless you know and have measured these, it's really hard to judge. You can create work loads and use tools like SQLIO to test your storage, but you will need workload patterns in order to build a proper test.

Finally, a note on AWS: To my knowledge, Amazon will not guarantee IO performance in AWS. This is primarily because the storage is a massive shared resource and it's impossible to gauge the patterns of you and your neighbors on a particular area of storage (see the Noisy Neighbor problem).

My recommendation would be to allocate as much memory as possible. SQL Server will only push stuff out of memory if it's under pressure and space in the buffer pool (based on LRU-K). So if you're buffer pool can store most of the database in memory, you can mitigate some of the bursty performance. Also, consider tactics that can keep cache objects "warm". Finally, keep an eye on SQL 2014 and the new Hekaton feature.

Context

StackExchange Database Administrators Q#59736, answer score: 10

Revisions (0)

No revisions yet.