In-Guest Defragmentation – The Holy Grail For Best Performance?
On one of our internal mailing lists, a question about in-guest defragmentation came up again. The right answer to this question is simple: it depends!
OK that’s fair, but it depends on what? Well mostly it depends on two things:
- What kind of storage device you store your virtual machines on. There are huge differences between an HP StorageWorks 2120 and an EMC Symmetrix V-MAX in terms of the features and capabilities that improve the overall performance of the storage device.
- What kind of features you leverage in your virtual environment: thin provisioning, snapshots, data deduplication and replication, at either the ESX host or the storage device level.
Scenario #1: you use enterprise-class SAN/NAS devices. In such a configuration, with multiple hosts hitting the storage devices, you very likely have a random IO pattern. No worries: such storage devices are very smart and can deal with random IO patterns using techniques like IO coalescing, read-ahead, caching algorithms, RAID striping, etc.
Storage devices such as NetApp arrays, with their WAFL mechanism, automatically fragment data across the disks. The technical report NetApp and VMware vSphere Storage Best Practices says on page 78:
Virtual machines stored on NetApp storage arrays should not use disk defragmentation utilities as the WAFL file system is designed to optimally place and access data at a level below the GOS file system
Features like thin provisioning, snapshots, data deduplication and replication may be impacted by in-guest defragmentation.
The IO load generated by the defrag process running inside the virtual machine negates the benefit of a thin disk by inflating it: every block the defragmenter writes to for the first time has to be allocated on the datastore. What’s the point of using a thin disk if you bloat it with in-guest defragmentation?
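To make the inflation effect concrete, here is a minimal sketch of a thin disk as a toy model. The ThinDisk class and the defragment helper are hypothetical names made up for this illustration, not a real VMDK or VMware API, and the model assumes no space reclamation happens between the guest and the datastore.

```python
class ThinDisk:
    """Toy model of a thin-provisioned virtual disk (hypothetical, not a real VMDK)."""

    def __init__(self, provisioned_blocks):
        self.provisioned_blocks = provisioned_blocks
        self.allocated = set()  # blocks currently backed by real datastore space

    def write(self, block):
        if not 0 <= block < self.provisioned_blocks:
            raise ValueError("block outside the provisioned size")
        # The first write to a block forces it to be allocated on the datastore.
        self.allocated.add(block)

    def allocated_blocks(self):
        return len(self.allocated)


def defragment(disk, file_blocks, target_start):
    """Rewrite a file as one contiguous run of blocks starting at target_start.

    The guest file system marks the old blocks as free, but in this model the
    thin disk never learns about it, so the old blocks stay allocated.
    """
    new_blocks = list(range(target_start, target_start + len(file_blocks)))
    for block in new_blocks:
        disk.write(block)
    return new_blocks


disk = ThinDisk(provisioned_blocks=1000)
fragmented_file = [10, 250, 40, 700, 90]  # made-up block numbers
for b in fragmented_file:
    disk.write(b)

before = disk.allocated_blocks()
defragment(disk, fragmented_file, target_start=500)
after = disk.allocated_blocks()
print(f"allocated before defrag: {before}, after defrag: {after}")
```

In this model the file still holds the same five blocks of data, yet the disk now consumes ten blocks on the datastore, because the old locations are never handed back.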
The same IO load generates a huge amount of additional IOs in the background and messes with snapshots, growing them unexpectedly until they may eventually reach the size of the parent disk, and creating many SCSI locks along the way, which drives up latency. Why would you do that to your virtual machine?
If you’re replicating to the other side of the earth, you may send a fair amount of extra bytes across your WAN link and eventually saturate it, causing high latency and disconnections. Are you sure you want the network team on your back?
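A quick back-of-the-envelope calculation shows why the WAN link suffers. This sketch assumes block-level replication that ships every changed block; the link speed and the change volumes are made-up figures for illustration, not measurements.

```python
def transfer_hours(changed_bytes, link_mbit_per_s, usable_fraction=1.0):
    """Hours needed to ship changed_bytes over a link of link_mbit_per_s megabits/s.

    usable_fraction is the share of the link the replication may actually use.
    """
    if link_mbit_per_s <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("link speed must be positive and usable_fraction in (0, 1]")
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8 * usable_fraction
    return changed_bytes / bytes_per_second / 3600


GIB = 1024 ** 3
normal_daily_change = 2 * GIB   # hypothetical daily change rate of the VM
defrag_rewrite = 30 * GIB       # hypothetical volume of data moved by a defrag pass
link = 10                       # hypothetical 10 Mbit/s WAN link

normal = transfer_hours(normal_daily_change, link)
with_defrag = transfer_hours(normal_daily_change + defrag_rewrite, link)
print(f"normal day: {normal:.1f} h, defrag day: {with_defrag:.1f} h")
```

With these assumed numbers, a replication job that normally finishes in well under an hour keeps the link busy for a good part of the day after a defragmentation pass.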
Scenario #2: dumb DAS devices. You have set up a virtual environment with no shared storage array, using one disk or maybe a few disks in a RAID configuration. In that case, in-guest defragmentation could be an improvement, or I should say a mitigation of the overall performance degradation of your DAS. Scott Drummond of vpivot.com published an interesting article called Windows Guest Defragmentation, Take Two, demonstrating the benefits of in-guest defragmentation in a very specific storage configuration: a DAS!
Again, even though you’re using a dumb DAS device, leveraging features like thin provisioning may render in-guest defragmentation inappropriate, for the same reasons I highlighted in scenario #1.
You may attach the DAS to a smart SCSI controller with plenty of cache and the ability to do IO coalescing, but at the end of the day defragmenting is about writes, many writes, generating many IOs that neither the controller’s cache nor the physical disks in the DAS can absorb efficiently.
My view is that in-guest defragmentation is, for most of the environments I have worked in, just totally inappropriate and may actually decrease overall performance. The overhead of running the defragmentation process is likely to be a burden that outweighs the virtual benefit.
As usual it depends on YOUR environment and the way you’ve designed your storage, the features you have enabled, your functional requirements, etc…
In any case, and this is valid for both of my scenarios, the easiest way to gain more performance out of your storage device is to align your VMs and your VMFS datastores to the storage device. VM alignment is critical, especially for Microsoft Windows versions prior to Vista and 2008. Another way to improve IO is to disable NTFS access time updates in the guest. And finally, at the ESX host level, make sure your VMFS datastores are aligned by creating them through the vCenter Client.
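The alignment check itself is simple arithmetic, sketched below. The helper names are hypothetical, and the 64 KiB boundary is only an assumption for the example; the right boundary depends on your storage device, so check your vendor's recommendation. The two starting sectors are the usual defaults: sector 63 for Windows versions prior to Vista and 2008, and sector 2048 (a 1 MiB offset) for Vista and 2008.

```python
SECTOR_SIZE = 512  # bytes per sector, the classic value assumed here


def is_aligned(start_sector, boundary_bytes, sector_size=SECTOR_SIZE):
    """Return True if the partition's starting byte offset falls on the boundary."""
    return (start_sector * sector_size) % boundary_bytes == 0


def next_aligned_sector(start_sector, boundary_bytes, sector_size=SECTOR_SIZE):
    """Return the first sector at or after start_sector that sits on the boundary.

    Assumes boundary_bytes is a multiple of sector_size.
    """
    offset = start_sector * sector_size
    aligned_offset = -(-offset // boundary_bytes) * boundary_bytes  # round up
    return aligned_offset // sector_size


BOUNDARY = 64 * 1024  # assumed 64 KiB boundary, for illustration only

for label, sector in [("pre-Vista default", 63), ("Vista/2008 default", 2048)]:
    status = "aligned" if is_aligned(sector, BOUNDARY) else "misaligned"
    print(f"{label}: sector {sector} is {status}, "
          f"next aligned sector is {next_aligned_sector(sector, BOUNDARY)}")
```

A partition starting at sector 63 begins 32,256 bytes into the disk, which falls on neither a 4 KiB nor a 64 KiB boundary, so a single guest IO can straddle two blocks on the storage device. That is why the older Windows versions are the ones to watch.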