On my path to better understand file systems, here is a blog post about VMFS and one of its features called DLM. Note that I’m not a storage specialist, and what follows is my understanding of the documents I’ve read. Do not hesitate to chime in, as my interpretations may be missing some information or may simply be wrong!
What is VMFS?
From Wikipedia.org, VMFS is VMware’s cluster file system where multiple servers (hosts) can read/write the same file system (datastore) simultaneously, while individual virtual machine files are locked…
So we have a per-file locking mechanism. There is another one, the metadata locking mechanism, which is implemented with SCSI reservations. The latter only happens when the datastore (LUN) metadata is updated.
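To make the two levels of locking more concrete, here is a minimal Python sketch of how I picture them. It is a toy in-memory model, not VMFS code: the names ToyDatastore, ReservationConflict, reserve, release and lock_file are all made up for the illustration, and I am assuming that taking a per-file lock counts as a metadata update and therefore needs a short LUN-wide reservation.

```python
class ReservationConflict(Exception):
    """Raised when another host currently holds the LUN-wide reservation."""


class ToyDatastore:
    """Toy in-memory stand-in for a shared LUN (hypothetical, not VMFS's real layout)."""

    def __init__(self):
        self.reserved_by = None  # host holding the LUN-wide reservation, if any
        self.file_locks = {}     # file name -> host that owns the per-file lock

    def reserve(self, host):
        # Models a SCSI reservation: the whole LUN is held by a single host.
        if self.reserved_by not in (None, host):
            raise ReservationConflict(f"LUN reserved by {self.reserved_by}")
        self.reserved_by = host

    def release(self, host):
        # Only the holder can drop the reservation.
        if self.reserved_by == host:
            self.reserved_by = None

    def lock_file(self, host, name):
        """Try to take the per-file lock on `name`; True if `host` owns it afterwards.

        Assumption: writing the lock record is a metadata update, so the LUN
        is reserved only for the duration of that update.
        """
        self.reserve(host)
        try:
            owner = self.file_locks.setdefault(name, host)
            return owner == host
        finally:
            self.release(host)
```

In this model two hosts can each hold locks on different files of the same datastore at the same time, while the LUN-wide reservation is only held for the brief moment a lock record is written.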
The VMware® vStorage Virtual Machine File System Technical Overview and Best Practices talks about the Distributed Lock Management (aka Distributed Lock Server or Distributed Lock Service).
In the picture above, a classic VMware cluster environment, LUN1 is a clustered volume, and the VMFS driver provides the Distributed Lock Management that arbitrates access, allowing ESX Servers to share the clustered pool of storage.
What is Distributed Lock Management?
Again from Wikipedia.org, a DLM (Distributed Lock Manager) provides distributed software applications with a means to synchronize their accesses to shared resources. The key here is the distributed aspect of the locking method. DLM runs on each cluster node (host), and the lock management is distributed across all hosts in the cluster.
By contrast, in a non-distributed lock method, you would have a dedicated node in the cluster responsible for the locking. You can immediately see the problem in this case: that node is a single point of failure.
VMFS is also known as a symmetric shared disk file system, where metadata is distributed among the nodes, as opposed to an asymmetric one with centralized metadata servers, such as EMC Celerra HighRoad.
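Here is a small Python sketch of the difference between the two approaches, as I understand it. Everything in it is hypothetical (CentralLockManager, SharedDisk and HostLockAgent are names I invented), and the distributed variant assumes that the lock records live on the shared storage itself so that every host can arbitrate on its own. A real DLM is far more involved; this only illustrates the single point of failure.

```python
class CentralLockManager:
    """Non-distributed locking: one dedicated node owns every lock decision."""

    def __init__(self):
        self.alive = True
        self.owners = {}  # resource -> host holding the lock

    def acquire(self, host, resource):
        if not self.alive:
            # Nobody can lock anything while the dedicated node is down.
            raise RuntimeError("lock manager is down")
        return self.owners.setdefault(resource, host) == host


class SharedDisk:
    """Toy shared storage; the lock records are kept next to the data (assumption)."""

    def __init__(self):
        self.lock_records = {}  # resource -> host holding the lock


class HostLockAgent:
    """Distributed locking: every host runs its own agent against the shared disk."""

    def __init__(self, host, disk):
        self.host = host
        self.disk = disk

    def acquire(self, resource):
        # Each host decides locally by reading and writing the shared record.
        owner = self.disk.lock_records.setdefault(resource, self.host)
        return owner == self.host
```

With the central manager, setting alive to False blocks every host. With the per-host agents, losing one host only removes that host's agent, and the others keep acquiring locks on the resources it did not hold.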
There are other shared disk file systems using the Distributed Lock Management:
Sanbolic offers a comparison of the various shared file system methods and products currently available on the market in an excellent post called Cloud Application of Shared File System Technologies.
Comparing VMFS to Conventional File Systems
Again from the VMware® vStorage Virtual Machine File System Technical Overview and Best Practices, we have some more hints about VMFS:
- Conventional file systems allow only one server to have read/write access to a specific file at a given time. In contrast, VMFS is a cluster file system (CFS) that leverages shared storage to allow multiple instances of ESX Server to have concurrent read and write access to the same storage resources.
- VMFS also has distributed journaling of changes to the VMFS metadata to enable fast and resilient recovery across these multiple ESX Server clusters.
VMFS does not have every feature found today in other CFS and CVM systems. However, there is no other CFS or CVM that provides the capabilities of VMFS. Its distributed locking methods forge the link between the VM and the underlying storage resources in a manner that no other CFS or CVM can equal. The unique capabilities of VMFS enable VMs to join a VMware cluster seamlessly, with no management overhead.
The VMFS driver is one of the best pieces of software VMware has developed, but does it scale out in a cloud environment? Remember that the ‘recommended’ limit is still 32 hosts in a VMware cluster. I can’t believe it is a hard limit, and the same goes for the 2 TB maximum datastore size. Will VMFS v5 bring us major enhancements in this area?
Before I finish my post, I wanted to mention an excellent article by Duncan Epping at Yellow-bricks.com about the VMFS format and the VMFS driver and their respective abilities.