Impact of Fusion-io based tempdb on reindex duration
Do you have tempdb on Fusion-io (or equivalent) technology? Is your reindexing job left on default settings (re-orgs at 5-15% and full rebuilds at @ 30%)? If so, then you might be interested in the findings in this post.
The benefits of flash memory as a replacement for spinning drives in order to boost SQL Server performance is well documented. However, I’m not sure if one particular benefit has been fully exploited. This post covers how we reduced the average duration of our overnight reindex job by 80% with the help of just one Fusion-io card (per node).
A bit of background: in order to give a creaking and very tempdb heavy SQL Server 2008 R2 system a bit of a lift we shifted tempdb onto Fusion-io cards. The system consists of a 2-node cluster running on a SAN. We did indeed see throughput increase, as well as an improvement in concurrency (as we switched on snapshot isolation as well) but some 2 months after the cards were installed another benefit has come to light which has turned perceived wisdom of reorganising versus rebuilding indexes on its head (disclaimer: for our system running on our hardware, at any rate).
I’ve been DBA-ing for so long now that the good old 5%/30% general advice for reorganising as opposed to rebuilding an index as documented in BOL is so deeply ingrained that I’ve hardly given it a second thought and just followed it automatically, only tweaking it when performance analysis showed an issue with these thresholds. For more background info on these thresholds I strongly advise reading Paul Randal’s post on where these guidelines came from.
With tempdb on a much faster IO subsystem than the application database the SORT_IN_TEMPDB option for the ALTER INDEX command suddenly comes into play. After installing the Fusions I tweaked our reindex job to turn the SORT_IN_TEMPDB option on, made a mental note to follow this up in a few days time, and promptly forgot about it until several weeks later when I wanted to track down indexes that were getting highly fragmented (we use Ola Hallengren’s excellent maintenance script which logs this kind of information (and more) in a table) and noticed some indexes had huge variations in the time taken to reindex them, with a full rebuild taking a fraction of the time a reorg did, and that’s when the penny dropped.
The Fusion-io boost
I promptly turned off reorgs so now it rebuilds at 10% with every index typically getting fully rebuilt in under 5 minutes (multiple indexes are between 10-14 GB in size). Needless to say, this has had a massive impact on the total reindex time. The reindex job on this system used to take anywhere between 4 to 9 hours which was a real problem as this would often encroach well into the business day.
Since we switched off the reorgs, the reindex has never taken more then 2 hours (and the longest run at 114 minutes included 45 minutes of blocking). The graph below shows the performance improvement very er…graphically:
Purdy, ain’t it? As we can see, February 16th was the first night the reindex job used the rebuild option in place of reorgs, and the drop off in the run time could not be more dramatic. For the two weeks from 1st February – 14th February (inclusive) the average duration was 361 minutes, or 6 hours. After the SORT_IN_TEMPDB tweak on the 16th, the average (from the 16th up until today (March 11th) has been just 78 minutes.
Incidentally, the Fusion-io cards in question are 785GB ioDrive2 Mono MLC PCIe Solid State Storage Cards, and yes, technically MS don’t support this tempdb configuration on a pre SQL 2012 cluster, but lots of people are doing it.
The benefits
For us, modifying the reindex job to drop the reorgs in favour of rebuilds on this particular system was a no-brainer. The reindexing finishing in a quarter of the time taken previously on a consistent basis every night has eliminated the uncertainty and inconvenience caused when the original approach caused the index to collide with subsequent maintenance jobs or encroach into the working day of the system.
Furthermore, the database was in simple recovery so we didn’t have to worry about transaction log backups (although a reorg can have a greater impact on transaction log backups).
This is either a benefit that’s so obvious that no-one has bothered blogging about it, or it’s something that’s gone under the radar thus far (I’m obviously hoping it’s the latter). I’ve seen posts hint at it by mentioning index creation is faster etc, and I’ve even read a MS whitepaper analysing tempdb on Fusion-io on a SQL 2008R2 system (which did a reindexing test with the SORT_IN_TEMPB option on, but left the reindexing thresholds on defaults) but no-one, as far as I can tell, has gone the whole hog with regard to adapting their approach to reindexing and just rebuilding indexes in favour of the reorg option to fully exploit their investment in Fusion-io cards (or they just kept quiet about it, whereas I couldn’t help blabbing about it!); I can think of a number of scenarios and previous systems where the improvement in reindex job performance alone would justify this investment.
Are reorgs redundant?
If you have your tempdb on Fusion-io or similar technology, with the application database(s) on a slower IO subsystem with the reindexing strategy left on original settings its an option well worth exploring.
The things to look out for are the bandwidth from host to SAN as on this system we do briefly get close to maxing it out during the reindex now as it reads in the index pages and writes out the rebuilt index. Personally, I’d rather fully exploit our infrastructure than have it burbling along at the fraction of its capacity, which is what it used to be like when most of the indexes were being reorganised.
Another consideration is the latency and throughput difference between the storage subsystem the application dbs are stored on and the Fusion-io cards; against spinning disk SAN based systems there’s no competition, the Fusion-ios will outperfrom the SAN in every category, but if you have e.g. SSDs already in the SAN the difference may not be that substantial, so the switch may not bring about such a dramatic boost. On this particular system, in our environment, there was a huge disparity between SAN performance and the Fusions which has made the switch so beneficial for us.
If alternative HA strategies like log-shopping have been implemented the impact will have to be measured as the costs may outweigh the benefits (thanks Paul!).
A final thing to bear in mind is that if you are fortunate enough to have some older/smaller Fusions lying around (perhaps you’ve had to swap them out as the application dbs have got bigger) it’s a great way of recycling these cards on other systems to give them a new lease of life.
So are reorgs redundant? If you have access to this kind of technology, it’s easy enough to prove on your system.
In summary
“It depends”, and “your mileage may vary” have become well worn cliches, but they are so true. Every system is different and there are a number of factors to consider, as (I hope) the post outlines, but the potential benefits warrant exploring this, if you have not already done so.
Related posts
Stick your Fusions in the right slot!
Tempdb configuration and performance