Microsoft Virtualization Infrastructure Best Practices – Are You Ready To Pay More For Less?

I was reading a recently published article on Microsoft TechNet about how Microsoft moved to a virtualized infrastructure. The article describes why and how Microsoft deployed a virtualized infrastructure using in-house products such as Microsoft Hyper-V and Windows Server 2008 R2. Eating your own dog food is popular these days; VMware and Citrix do it as well. Nothing wrong with that, but not all virtualization infrastructures are equal, and some best practices can hurt companies’ budgets…

What actually surprised me is a feature called ‘Maintenance mode‘. Microsoft describes it as: ‘a server to not be available for virtual machine deployments’. I was thinking this was the same feature as VMware’s Maintenance Mode. That mode helps you evacuate VMs so you can, for instance, patch your hosts, replace faulty hardware, apply config changes that require a reboot, etc. Most of the time, a host is put in Maintenance Mode for a very short period; once it is back up you exit Maintenance Mode and the host participates in the cluster again, right?

I kept on reading and found this other description: ‘Maintenance mode enables operators to perform live migrations in a prescribed fashion rather than allowing virtual machines to be sent to more than one host within the cluster…’. If I get it right, Microsoft has a dedicated server host to receive migrated VMs. Perhaps that’s because Microsoft doesn’t have a DRS-like feature…

Again I kept on reading: ‘Also, if a node fails, this feature enables faster recovery time because the virtual machines from the failed node quickly migrate to the maintenance-mode node’. Now I think I get it! In other words, it allows a host server to be set aside for Maintenance mode ‘activities’ and thereby become the cluster’s target for a passive quick migration if another node fails.

Microsoft’s best practice calls for one host server in ‘Maintenance mode’ for every 15 active host servers!

Reading further, I came across this paragraph: ‘In addition to the total number of nodes, an organization should consider the type of cluster being deployed when it is determining the number of maintenance modes to have. For example, if the cluster hosts databases or other systems that maintain state, a minimum of two maintenance-mode nodes provides better protection against unexpected downtime.’ That makes two host servers in ‘Maintenance mode’ for every 14 active host servers now. Personally, I call this a huge waste of resources 😮

Let’s do the math: one to two ‘Maintenance mode’ host server(s) for every 15 or 14 active host servers. If I transplant this ‘best practice’ to one of my biggest customers, they would have between 16 and 32 blade servers up but doing nothing, just idle and waiting! That’s up to two fully populated HP C7000 enclosures sitting there waiting. Microsoft, are you kidding me?
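To put numbers on that waste, here is a minimal sketch (the 1-per-15 and 2-per-14 ratios come from the article; the host counts are hypothetical):

```python
import math

def maintenance_overhead(active_hosts, maintenance_per_group, active_per_group):
    """Idle hosts mandated by the 'Maintenance mode' best practice,
    and the fraction of the whole cluster left doing nothing."""
    groups = math.ceil(active_hosts / active_per_group)
    idle = groups * maintenance_per_group
    return idle, idle / (active_hosts + idle)

# 240 active hosts, 1 maintenance node per 15 active hosts
print(maintenance_overhead(240, 1, 15))  # → (16, 0.0625): 16 idle blades
# Stateful workloads: 2 maintenance nodes per 14 active hosts
print(maintenance_overhead(224, 2, 14))  # → (32, 0.125): 32 idle blades
```

With 16 blade slots per HP C7000 enclosure, 32 idle blades is indeed two full enclosures sitting there waiting.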

Fortunately, my customer runs vSphere/vCenter with DRS/HA/vMotion and many other features only VMware can offer today, from SMBs up to enterprise-class customers. Better resource utilization, higher consolidation ratios, fewer physical servers, a smaller footprint, high availability, and greener. VMware, what else?

About PiroNet

Didier Pironet is an independent blogger and freelancer with 15+ years of IT industry experience. Didier is also a former VMware, Inc. employee, where he specialised in Datacenter and Cloud Infrastructure products as well as Infrastructure, Operations and IT Business Management products. Didier is passionate about technology, a creative and visionary thinker who expresses himself with passion and excitement, hopefully inspiring and enrolling people in innovation and change.

4 Responses to Microsoft Virtualization Infrastructure Best Practices – Are You Ready To Pay More For Less?

  1. David says:


    Nice article. Have a read of this link to get clarity re the “passive/maintenance node” recommendations

    It seems to clarify the situation a bit more


  2. deinoscloud says:

    Hi David and thanks for commenting.

    I went through the link you supplied and it was very interesting.

    Either way, you need to set a host server in maintenance mode or reserve spare capacity across your active server hosts to allow for placement of VMs if a single node fails in your cluster. For me it is still a waste of resources; the spare hardware is idle and doing nothing.

    In the context of the document that describes the move to virtualization, Microsoft uses one server host in maintenance mode for every 15 active server hosts, and goes as far as recommending 2 server hosts in maintenance mode for critical clusters/VMs.

    Now regarding VMware, FYI: in an HA cluster the maximum is 160 VMs per host with no more than 8 server hosts, and only 40 VMs per host starting at 9 server hosts… That is far below Hyper-V’s 384 VMs max per server host, with a max of 1000 VMs in total in a single cluster…
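    Taking the quoted limits at face value, the effective cluster ceilings can be sketched like this (a back-of-the-envelope comparison only, not official sizing guidance):

```python
def vsphere_ha_max_vms(hosts):
    # 160 VMs/host up to 8 hosts, 40 VMs/host from 9 hosts (limits quoted above)
    per_host = 160 if hosts <= 8 else 40
    return hosts * per_host

def hyperv_max_vms(hosts):
    # 384 VMs/host, capped at 1000 VMs per cluster (limits quoted above)
    return min(hosts * 384, 1000)

print(vsphere_ha_max_vms(8))   # → 1280
print(vsphere_ha_max_vms(16))  # → 640
print(hyperv_max_vms(16))      # → 1000
```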


  3. The reason Microsoft has this “best practice” is that they can’t overcommit memory (yet). They can only overcommit CPU. So if you don’t leave 1 in N servers unused, you might not have enough memory resources for an “HA event” (VMware terminology). Remember: this is like having a 100% reservation on all your VMs on vSphere. Memory is guaranteed, which means your VM won’t run if you’re even 1 MB short.
    You could get away with leaving 1/N of your memory resources unused (spread across the servers), but you might still get in trouble that way (8 times 512 MB on different servers still doesn’t let you run that 4 GB VM).
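    That fragmentation point can be sketched as a simple placement check (hypothetical helper, assuming no memory overcommit, i.e. the full reservation must fit on a single host):

```python
def can_place(vm_memory_mb, free_per_host_mb):
    # Without memory overcommit, a VM can only start if ONE host
    # has the VM's entire memory reservation free.
    return any(free >= vm_memory_mb for free in free_per_host_mb)

# 8 hosts with 512 MB free each: 4 GB free in total, yet the 4 GB VM cannot start
print(can_place(4096, [512] * 8))           # → False
print(can_place(4096, [4096] + [512] * 7))  # → True
```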

    • deinoscloud says:

      Hi Bert, thanks for your comment.

      >8 times 512MB on different servers still doesn’t
      >let you run that 4GB VM

      That is so true! I expect this ‘best practice’ to change the second Microsoft gets a proper set of memory overcommit techniques…
