VMware HA stands for High Availability. Over the past four years VMware HA has evolved significantly, but the lack of some features and the limitations of existing ones are now showing up in large architecture designs and could turn into a deployment show-stopper in the coming months or years if VMware doesn’t deliver an enhanced version or eventually radically switches to a new HA agent. First, I think a small introduction to VMware HA is necessary.
Where does that agent come from?
VMware HA is really based on a stripped-down version of the Legato Automated Availability Manager 5.1.2, aka LAAM. When EMC took over Legato in 2003, the L was dropped from the agent’s name. Smart people like Deepak Narain developed the agent at Legato back in 2002.
What does that agent do?
VMware HA’s main job is to monitor the ESX service console network interface card (NIC). Besides that main function, VMware HA provides high availability for virtual machines by pooling them; in the event of a failure, the virtual machines on a failed host are restarted on alternate hosts.
VMware HA does:
- protect against a server failure by automatically restarting the virtual machines on other hosts within the cluster.
- protect against operating system failures by continuously monitoring a virtual machine and resetting it in the event that a failure is detected.
What happens in the case of Failure Detection, Host Network Isolation, and Operating System Failure?
As described in the vSphere Availability Guide, HA agents communicate with each other and monitor the liveness of the hosts in the cluster. This is done through the exchange of heartbeats, by default every second. If a 15-second period elapses without the receipt of heartbeats from a host, and the host cannot be pinged, it is declared failed. In the event of a host failure, the virtual machines running on that host are failed over, that is, restarted on the alternate hosts with the most available unreserved capacity (CPU and memory).
Host network isolation occurs when a host is still running but can no longer communicate with the other hosts in the cluster. With default settings, if a host stops receiving heartbeats from all other hosts in the cluster for more than 12 seconds, it attempts to ping its isolation addresses. If this also fails, the host declares itself isolated from the network. When the isolated host’s network connection is not restored for 15 seconds or longer, the other hosts in the cluster treat it as failed and attempt to fail over its virtual machines. However, an isolated host that retains access to the shared storage also retains the disk locks on its virtual machine files. Because VMFS disk locking prevents simultaneous write operations to the virtual machine disk files (to avoid potential data corruption), the attempts to fail over the isolated host’s virtual machines fail. By default, the isolated host leaves its virtual machines powered on, but you can change the host isolation response to Shut Down VM or Power Off VM.
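The timing logic above can be sketched as two small decision functions. This is purely an illustrative Python simulation of the default thresholds (1-second heartbeats, a 15-second failure window, a 12-second isolation check); the function and parameter names are my own and the real AAM agent is of course far more involved.

```python
# Illustrative simulation of VMware HA's default failure/isolation timing.
# All names and the simplified logic are my own, not VMware's agent code.

HEARTBEAT_INTERVAL = 1    # seconds between heartbeats (default)
FAILURE_WINDOW = 15       # no heartbeats for 15s + ping fails -> host failed
ISOLATION_WINDOW = 12     # no heartbeats from anyone for 12s -> check isolation

def peer_view(seconds_since_heartbeat: int, ping_ok: bool) -> str:
    """How the other cluster nodes classify a silent host."""
    if seconds_since_heartbeat >= FAILURE_WINDOW and not ping_ok:
        return "failed"          # its VMs are restarted on alternate hosts
    return "alive"

def self_view(seconds_without_any_heartbeat: int,
              isolation_address_ping_ok: bool) -> str:
    """How a host that hears nobody classifies itself."""
    if (seconds_without_any_heartbeat > ISOLATION_WINDOW
            and not isolation_address_ping_ok):
        return "isolated"        # apply the isolation response: leave powered
                                 # on (default), shut down, or power off VMs
    return "connected"

print(peer_view(16, ping_ok=False))                    # failed
print(self_view(13, isolation_address_ping_ok=False))  # isolated
```

Note the gap between the two windows: an isolated host knows it is isolated at second 12, three seconds before its peers declare it failed at second 15, which is what gives the isolation response time to act.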
To monitor operating system failures, VMware HA monitors heartbeat information provided by the VMware Tools package installed in each virtual machine in the VMware HA cluster. Failures are detected when no heartbeat is received from a given virtual machine within a user-specified time interval. The virtual machine is then restarted on an alternate host.
Lacks and limitations, what are they?
- Taken from the vSphere 4.0 Configuration Maximums, the following limits need to be considered, especially when thinking about the vCloud, where you might require far more than 32 hosts in an HA cluster and far more than 40 guests per host:
- Hosts per HA cluster -> 32 max
- Virtual machines per host in HA cluster with 8 or fewer hosts -> 100 max
- Virtual machines per host in HA cluster with 8 or fewer hosts for vSphere 4.0 Update 1 -> 160 max
- Virtual machines per host in HA cluster with 9 or more hosts -> 40 max
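To see why these maximums bite in large designs, here is a quick back-of-the-envelope calculation of the total VM count an HA cluster can reach under the limits above. The helper and constant names are my own; the arithmetic simply applies the published figures:

```python
# Back-of-the-envelope HA cluster capacity under the vSphere 4.0 config
# maximums listed above. Purely illustrative; the names are my own.

MAX_HOSTS = 32
VMS_PER_HOST_SMALL_CLUSTER = 100   # 8 or fewer hosts (160 with 4.0 Update 1)
VMS_PER_HOST_LARGE_CLUSTER = 40    # 9 or more hosts

def cluster_capacity(hosts: int, update1: bool = False) -> int:
    """Maximum VMs an HA cluster of `hosts` nodes supports."""
    if hosts > MAX_HOSTS:
        raise ValueError("HA clusters are capped at 32 hosts")
    if hosts <= 8:
        per_host = 160 if update1 else VMS_PER_HOST_SMALL_CLUSTER
    else:
        per_host = VMS_PER_HOST_LARGE_CLUSTER
    return hosts * per_host

print(cluster_capacity(8))                # 800
print(cluster_capacity(8, update1=True))  # 1280
print(cluster_capacity(32))               # 1280
```

The striking part: on Update 1, a maxed-out 32-host cluster holds no more VMs than an 8-host one (1280 each), so scaling out the cluster buys you host redundancy but not VM density.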
- Next, VMware HA won’t by default protect you against the failure of the guest operating system. Even if you turn that feature on, you gain some level of protection against the failure of the guest OS, but you still won’t get protection for a specific application within the guest operating system, unlike, for instance, an OS-level cluster. An advanced configuration setting, das.iostatsInterval, helps you avoid restarting a guest whose heartbeat has ceased by checking the guest’s I/O activity over a certain period of time.
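To illustrate how das.iostatsInterval changes the VM-monitoring decision, here is a hedged sketch (the function name and simplified logic are mine, and I assume the commonly cited 120-second default for the interval): a guest is only reset when its Tools heartbeat is gone and no disk or network I/O has been seen within the interval.

```python
# Simplified sketch of VM monitoring with das.iostatsInterval.
# A guest whose VMware Tools heartbeat stops is reset only if it has also
# shown no disk/network I/O for the configured interval (assumed default
# 120s here). Function and parameter names are my own, not VMware's.

DAS_IOSTATS_INTERVAL = 120  # seconds (assumed default)

def should_reset_vm(heartbeat_lost: bool,
                    seconds_since_last_io: int) -> bool:
    if not heartbeat_lost:
        return False   # Tools heartbeat is fine: do nothing
    if seconds_since_last_io < DAS_IOSTATS_INTERVAL:
        return False   # heartbeat gone but guest still does I/O --
                       # probably Tools hung, not the OS
    return True        # no heartbeat and no I/O: restart the VM

print(should_reset_vm(True, 30))    # False
print(should_reset_vm(True, 300))   # True
```

In other words, the setting trades a slower reaction to a genuinely dead guest for far fewer false-positive restarts of guests where only VMware Tools has wedged.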
- Another point: VMware HA clusters spread over geographically dispersed Data Centers will be common designs sooner than we think, and as Chad Sakac posted in response to a blog from Arnim Van Lieshout, HA needs:
- A more “SRM like” ability to control restart conditions/sequencing and
- A more transparent way to define primaries/secondaries.
- Finally, the current maximum of 5 primaries per VMware HA cluster is just not enough to cope with large vCloud environments. For those who run Blades, one of the first recommendations is to avoid having those 5 primaries running on, for instance, a single Blade chassis, also known as a ‘possible failure domain’. If the chassis goes down for any reason, you lose your HA capabilities! There is a way to configure an HA node as primary or secondary; however, it’s not possible to configure an ESX host as a “fixed” primary HA node. There is hope: back in September 2009, during VMworld 2009, Marc Sevigny from VMware revealed that a future release of HA would contain an option that would allow you to pick your primary hosts.
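The blade-chassis concern can at least be checked mechanically: given where the five primaries currently live, verify that they don’t all share one failure domain. A hypothetical sketch (the host-to-chassis mapping here is invented; in practice you would pull it from your inventory or vCenter):

```python
# Check that HA primary nodes don't all live in a single failure domain
# (e.g. one blade chassis). The inventory data is hypothetical.

def primaries_at_risk(primary_to_chassis: dict) -> bool:
    """True if every primary sits in the same chassis."""
    return len(set(primary_to_chassis.values())) == 1

primaries = {
    "esx01": "chassis-A",
    "esx02": "chassis-A",
    "esx03": "chassis-A",
    "esx04": "chassis-A",
    "esx05": "chassis-A",
}
print(primaries_at_risk(primaries))   # True: lose chassis-A, lose HA

primaries["esx05"] = "chassis-B"      # move one primary to another chassis
print(primaries_at_risk(primaries))   # False: at least one primary survives
```

Of course, until VMware exposes a supported way to pin primaries, this check tells you about today’s placement only; a host failure or maintenance-mode event can silently re-elect primaries back into one chassis.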
VMware HA is probably the feature with the most advanced settings, as stated by Duncan Epping! Don’t get me wrong, I share Duncan’s view: HA is awesome, and many customers rely on it every day to get high availability across their VMware environments! But in my opinion HA has also reached its limits in many aspects and definitely needs to be improved, and why not totally rewritten, to cope with tomorrow’s challenging architecture designs of the vCloud initiative.
This is an open discussion where I throw in my ideas and point of view. I welcome anybody to leave comments!