VMware High Availability Isolation Event – Hack The Eight Minutes Delay


Another excellent post from Duncan Epping about das.failuredetection and its ‘relation’ to das.maxvmrestartcount triggered this article.

So, to recap, let me copy/paste Duncan’s isolation response schema:

  • T+0 – Restart
  • T+2 – Restart retry 1
  • T+4 – Restart retry 2
  • T+8 – Restart retry 3
  • T+8 – Restart retry 4
  • T+8 – Restart retry 5
To this schema I will just add a couple of notes:

    • For each of the five iterations, some randomness is added in order to smooth out the spikes of many simultaneous attempts to power on different VMs. This is between 1 and 10 seconds.
    • The delay is doubled at each iteration. It is not a “+2” operation.
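
As a rough illustration of the two notes above, here is a minimal Python sketch. It is not the Perl code from vmwaremanager.pl. It assumes the first retry delay is 120 seconds, as the schema suggests, and that the random 1 to 10 seconds is added after the cap is applied; the function name and that order of operations are my assumptions.

```python
import random

VM_RESTART_DELAY_MAX = 480  # seconds (8 min), the cap quoted from vmwaremanager.pl
INITIAL_DELAY = 120         # seconds; assumption based on "T+2 - Restart retry 1"


def restart_delays(retries=5, initial=INITIAL_DELAY,
                   cap=VM_RESTART_DELAY_MAX, jitter=False):
    """Return the delay (in seconds) before each restart retry.

    The delay doubles at every iteration and is clamped to `cap`.
    With jitter=True, 1 to 10 random seconds are added per iteration
    (assumed here to be added on top of the capped value).
    """
    delays = []
    delay = initial
    for _ in range(retries):
        capped = min(delay, cap)
        extra = random.randint(1, 10) if jitter else 0
        delays.append(capped + extra)
        delay *= 2  # doubled, not "+2"
    return delays


if __name__ == "__main__":
    print(restart_delays())         # default cap of 480 seconds
    print(restart_delays(cap=120))  # what a lower cap would do
```

Without jitter, the default cap gives 120, 240, 480, 480, 480 seconds, which matches the schema; lowering the cap to 120 flattens every retry to 120 seconds.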

    You may have noticed that from iteration #3 onward, the VM restart delay is ‘stuck’ at 8 minutes, that is 480 seconds. Is there a way to change that value? Yes, there is, but first read the reminder below…

    Reminder: this is an UNSUPPORTED hack that can lead to an UNSTABLE system! Don’t try this in a production environment (don’t even think about it) and read the disclaimer below! I shall not be liable for any damages arising out of it!

    So now that you have been warned, let’s dig in…

    1. Go to your ESXi host console (SSH)
    2. Navigate to /opt/vmware/aam/ha
    3. vi vmwaremanager.pl
    4. Scroll down to line 37: “my $VM_RESTART_DELAY_MAX = 480; #8 min”
    5. Change the value to whatever you want, e.g. 120
    6. Save and quit
    7. Restart the management agents: service mgmt-vmware restart and service vmware-vpxa restart
    8. Test 🙂
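
Steps 3 to 6 could also be scripted instead of done by hand in vi. What follows is only a sketch in Python, assuming the line looks exactly like the one quoted in step 4. The function name set_restart_delay_max is hypothetical, and it works on the file content as a string so that it can be tried on a copy of the file first.

```python
import re

# Matches the declaration quoted in step 4, e.g.
#   my $VM_RESTART_DELAY_MAX = 480; #8 min
_PATTERN = re.compile(
    r"^(\s*my\s+\$VM_RESTART_DELAY_MAX\s*=\s*)\d+(\s*;)",
    re.MULTILINE,
)


def set_restart_delay_max(source_text, new_value):
    """Return source_text with the VM_RESTART_DELAY_MAX value replaced.

    Raises ValueError if the declaration is not found exactly once,
    so an unexpected file layout is never silently modified.
    """
    if not isinstance(new_value, int) or new_value <= 0:
        raise ValueError("new_value must be a positive integer (seconds)")

    new_text, count = _PATTERN.subn(
        lambda m: f"{m.group(1)}{new_value}{m.group(2)}",
        source_text,
    )
    if count != 1:
        raise ValueError(f"expected exactly one declaration, found {count}")
    return new_text


if __name__ == "__main__":
    sample = "my $VM_RESTART_DELAY_MAX = 480; #8 min\n"
    print(set_restart_delay_max(sample, 120), end="")
```

Only the number changes; the trailing “#8 min” comment is left as it is, so update it by hand if that matters to you. Keep a backup of the original file, and the agents still have to be restarted as in step 7.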

    To learn more about the subroutine that calculates the number of seconds to delay the next VM restart, open up vmwaremanager.pl and go to line 5199.
    By the way, have you noticed the “my $CHECK_VMSTATE_INTERVAL = 15; #sec”?
    I may come back to that variable in another post 😉

    DISCLAIMER. THIS INFORMATION IS PROVIDED TO YOU “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, WHETHER ORAL OR WRITTEN, EXPRESS OR IMPLIED. THE AUTHOR SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OR CONDITIONS OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE AND SHALL NOT BE LIABLE FOR ANY DAMAGES ARISING OUT OF OR IN CONNECTION WITH THE USE OF THIS CONTENT, INCLUDING DIRECT, INDIRECT, CONSEQUENTIAL DAMAGES, LOSS OF BUSINESS PROFITS OR SPECIAL DAMAGES, EVEN IF THE AUTHOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

    About PiroNet

    Didier Pironet is an independent blogger and freelancer with 15+ years of IT industry experience. Didier is also a former VMware Inc. employee, where he specialised in Datacenter and Cloud Infrastructure products as well as Infrastructure, Operations and IT Business Management products. Didier is passionate about technology and is known as a creative and visionary thinker who speaks with passion and excitement, hopefully inspiring people to embrace innovation and change.