How Quorum works in Windows Server 2012 R2 Failover Clusters

Although we hear the word “quorum” pretty often in the news, where it usually relates to politics and parliaments, I’m sure that real ITPros have different feelings about this word :). Of course, I mean cluster quorum.

When you create a cluster, one of the most important things to take care of is how to configure and maintain quorum. Earlier Windows Server versions required you to choose a quorum mode such as Node Majority, Node and Disk Majority, or Node and File Share Majority. In Windows Server 2012 R2, you no longer pick one of these modes by default. Instead, the cluster relies on Dynamic Quorum (first introduced in Windows Server 2012), which gives the cluster the ability to recalculate quorum in the event of node failure and still keep clustered roles running, even when the number of voting nodes remaining in the cluster drops below 50 percent. This results in greater cluster availability and higher uptime.
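If you want to confirm that dynamic quorum is in effect, you can query the DynamicQuorum cluster property with Windows PowerShell. This is a minimal sketch, assuming you run it on one of the cluster nodes (or pass a cluster name to Get-Cluster); a value of 1 means dynamic quorum is enabled:

# Check whether dynamic quorum is enabled (1 = enabled, 0 = disabled)
(Get-Cluster).DynamicQuorum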

Also in Windows Server 2012 R2, this feature is further enhanced by the concept of Dynamic Witness. When you configure a cluster in Windows Server 2012 R2, dynamic quorum is selected by default, and the witness vote is also adjusted dynamically based on the number of voting nodes in the current cluster membership. For example, if the cluster has an odd number of node votes, the quorum witness does not have a vote; if the number of node votes is even, the quorum witness does have a vote. If the witness resource fails or goes offline for some reason, the cluster automatically sets the witness vote to 0. This approach greatly reduces the risk of the cluster malfunctioning because of a failing witness. If you want to see whether the witness currently has a vote, you can use Windows PowerShell and a new cluster property in the following cmdlet:

(Get-Cluster).WitnessDynamicWeight

A value of 0 indicates that the witness does not have a vote; a value of 1 indicates that it does. The cluster can now decide whether to use the witness vote based on the number of voting nodes that are available in the cluster. An additional benefit is a much simpler quorum configuration: Windows Server 2012 R2 configures the quorum witness automatically when you create a cluster.
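To review the quorum configuration that the cluster chose for you, you can use the Get-ClusterQuorum cmdlet. A minimal sketch, assuming it is run on one of the cluster nodes:

# Show the current quorum configuration and the witness resource (if any)
Get-ClusterQuorum | Format-List *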

Also, when you add or evict cluster nodes, you no longer have to adjust the quorum configuration manually. The cluster now automatically determines the quorum management options and the quorum witness.
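If you want to verify how the votes were redistributed after adding or evicting a node, you can list both vote properties for each node. NodeWeight is the administratively assigned vote, while DynamicWeight (new in Windows Server 2012 R2) shows whether the node currently holds a vote; the exact columns you display are of course up to you:

# List each node's assigned vote (NodeWeight) and its current dynamic vote (DynamicWeight)
Get-ClusterNode | Format-Table Name, Id, State, NodeWeight, DynamicWeight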

But it’s not only the witness that can gain and lose its vote in the quorum automatically. In Windows Server 2012 R2, the same can apply to nodes, through the Tie Breaker for 50% Node Split feature. The cluster can now adjust a running node’s vote status automatically to keep the total number of votes in the cluster odd. For example, suppose you have a cluster with an even number of nodes and a file share witness. If the file share witness fails, the cluster uses dynamic witness functionality to remove the witness vote automatically. However, because the cluster then has an even number of votes, the tie breaker picks a node at random and removes its quorum vote to maintain an odd number of votes. If the nodes are distributed evenly across two sites, this helps keep the cluster functional in one of the sites. In previous Windows Server versions, if both sites had an equal number of nodes and the file share witness failed, both sites lost quorum and the cluster stopped.

If you want to avoid the node being picked randomly, you can use the LowerQuorumPriorityNodeID property to predetermine which node has its vote removed. You can set this property by using the following Windows PowerShell command, where "1" is the example node ID for a node in the site that you consider less critical:

(Get-Cluster).LowerQuorumPriorityNodeID = 1
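Node IDs do not always match node names, so if you are not sure which ID belongs to which node, you can look it up first. A small sketch, assuming a node named NodeB in the site you consider less critical (the name is just an example):

# Look up the ID of the less important node and use it as the lower quorum priority node
$node = Get-ClusterNode -Name "NodeB"
(Get-Cluster).LowerQuorumPriorityNodeID = $node.Id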

However, this is not the only new quorum feature that Microsoft provided for Failover Clustering. Force quorum resiliency adds support and flexibility for split-brain cluster scenarios. This unpleasant scenario happens when the cluster breaks into subsets of cluster nodes that are not aware of each other. The subset that has a majority of votes keeps running, while the others are shut down. This usually happens in multisite cluster deployments. If you want to start cluster nodes that do not have a majority, you can force quorum manually by using the /fq switch. So far, this is the same as before. However, in Windows Server 2012 R2, the cluster automatically detects partitions as soon as connectivity between the nodes is restored. The partition that was started by forcing quorum is considered authoritative, and the other nodes rejoin the cluster, bringing it back to a single view of membership. Previously, as in Windows Server 2012, partitioned nodes without quorum did not start automatically, and the administrator had to start them manually with the /pq (prevent quorum) switch. In Windows Server 2012 R2, both sides of the split cluster have a view of cluster membership, and they reconcile automatically when connectivity is restored.
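To put this into commands: you can force quorum either with the cluster service switches mentioned above or with their Windows PowerShell equivalents. A rough sketch, assuming a node named NodeA in the partition that you want to make authoritative (the node names are examples):

# Force the cluster to start on this node even without a majority of votes
Start-ClusterNode -Name "NodeA" -ForceQuorum

# The same can be done from the command line on that node
net start clussvc /fq

# On Windows Server 2012, nodes in the other partition had to be started manually in prevent quorum mode
Start-ClusterNode -Name "NodeB" -PreventQuorum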