How Quorum works in Windows Server 2012 R2 Failover Clusters

 

Although we often hear the word “quorum” in the news, usually in connection with politics and parliaments, I’m sure that real IT pros have different feelings about this word :). Of course, I mean cluster quorum.

When you create a cluster, one of the most important things to take care of is how to configure and maintain quorum. Earlier Windows Server versions relied on statically configured quorum modes such as Node Majority, Node and Disk Majority, and Node and File Share Majority. In Windows Server 2012 R2, you no longer have to choose between these modes by default. Instead, the cluster relies on Dynamic Quorum, a feature that first appeared in Windows Server 2012 and is enabled by default. It lets a cluster recalculate quorum when a node fails and keep clustered roles running, even when fewer than 50 percent of the original voting nodes remain in the cluster. This results in greater cluster availability and higher uptime.
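To make the "less than 50 percent" claim concrete, here is a minimal Python sketch that compares a static vote count with a dynamically recalculated one. It is only a simplified model of the idea described above, not how the cluster service is implemented: it assumes one vote per node, no witness, no tie breaker, and nodes failing one at a time. The function names are made up for this illustration.

```python
def majority(total_votes):
    """Smallest number of votes that is more than half of total_votes."""
    return total_votes // 2 + 1


def smallest_surviving_size(nodes, dynamic):
    """Fail nodes one at a time; return the smallest node count that still has quorum.

    Simplified model (assumption): one vote per node, no witness, no tie breaker.
    """
    total_votes = nodes  # votes that quorum is currently calculated against
    running = nodes      # nodes that are still up
    while running > 1:
        if running - 1 < majority(total_votes):
            break        # losing one more node would lose quorum
        running -= 1
        if dynamic:
            # Dynamic quorum: the failed node's vote is removed from the total.
            total_votes = running
    return running


if __name__ == "__main__":
    print("static :", smallest_surviving_size(5, dynamic=False))
    print("dynamic:", smallest_surviving_size(5, dynamic=True))
```

In this model a five-node cluster with a static vote count stops once it would drop below three nodes, while the dynamic version keeps running with two. Getting from two nodes down to one is where the witness and the tie breaker come into play.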

Windows Server 2012 R2 enhances this feature further by introducing Dynamic Witness. When you configure a cluster in Windows Server 2012 R2, dynamic quorum is selected by default, and the witness vote is also adjusted dynamically based on the number of voting nodes in the current cluster membership. For example, if a cluster has an odd number of voting nodes, the quorum witness does not have a vote. If the number of voting nodes is even, the quorum witness does have a vote. If the witness resource fails or goes offline, the cluster automatically sets the witness vote to 0. This approach greatly reduces the risk that a failing witness brings down the cluster. To see whether the witness currently has a vote, you can use Windows PowerShell and a new cluster property:

(Get-Cluster).WitnessDynamicWeight

A value of 0 indicates that the witness does not have a vote. A value of 1 indicates that the witness has a vote. The cluster can now decide whether to use the witness vote based on the number of voting nodes that are available in the cluster. A much simpler quorum configuration when you create a cluster is an additional benefit. Windows Server 2012 R2 will configure quorum witness automatically when you create a cluster.
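The rule behind that value can be written down in a few lines. This Python sketch only restates the behavior described above (odd number of voting nodes: no witness vote; even number: witness vote; failed witness: no vote). The function name is hypothetical and the cluster does not expose anything like it.

```python
def witness_dynamic_weight(voting_nodes, witness_online=True):
    """Model of the witness vote: 1 if the witness votes, 0 if it does not."""
    if not witness_online:
        return 0  # a failed or offline witness never keeps its vote
    # The witness votes only when it is needed to make the total odd.
    return 1 if voting_nodes % 2 == 0 else 0


if __name__ == "__main__":
    for nodes in (3, 4):
        print(nodes, "voting nodes ->", witness_dynamic_weight(nodes))
```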

Also, when you add or evict cluster nodes, you no longer have to adjust the quorum configuration manually. The cluster now automatically determines quorum management options and quorum witness.

But the witness is not the only thing that can gain and lose its quorum vote automatically. In Windows Server 2012 R2, this also applies to nodes, through the Tie Breaker for 50% Node Split feature. The cluster can now adjust a running node’s vote status automatically to keep the total number of votes in the cluster at an odd number. For example, if you have a cluster with an even number of nodes and a file share witness, and the file share witness fails, the cluster uses dynamic witness functionality to remove the vote from the file share witness automatically. However, because the cluster now has an even number of votes, the cluster tie breaker picks a node randomly and removes its quorum vote to maintain an odd number of votes. If the nodes are distributed evenly across two sites, this helps to keep the cluster functional in one site. In previous Windows Server versions, if both sites have an equal number of nodes and the file share witness fails, both sites stop the cluster.

If you want to avoid the node being picked randomly, you can use the LowerQuorumPriorityNodeID property to predetermine which node has its vote removed. You can set this property by using the following Windows PowerShell command, where "1" is the example node ID for a node in the site that you consider less critical:

(Get-Cluster).LowerQuorumPriorityNodeID = 1
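The selection logic can be sketched in Python as follows. This is a simplified model of the behavior described above, not the cluster's actual algorithm: the function and parameter names are hypothetical, and `None` stands in for "no preferred node configured".

```python
import random


def apply_tie_breaker(node_ids, witness_votes, lower_priority_node_id=None, rng=random):
    """Return the set of node IDs that keep their quorum vote.

    node_ids: IDs of running nodes that currently vote.
    witness_votes: 1 if the witness currently has a vote, otherwise 0.
    lower_priority_node_id: node that should lose its vote first (None = pick randomly).
    """
    voters = set(node_ids)
    total_votes = len(voters) + witness_votes
    if total_votes % 2 == 1 or len(voters) <= 1:
        return voters  # already an odd number of votes, nothing to adjust
    if lower_priority_node_id in voters:
        loser = lower_priority_node_id      # predetermined node loses its vote
    else:
        loser = rng.choice(sorted(voters))  # otherwise a random node is picked
    voters.discard(loser)
    return voters


if __name__ == "__main__":
    # Four nodes, the file share witness has failed (0 votes), node 1 is less critical.
    print(apply_tie_breaker([1, 2, 3, 4], witness_votes=0, lower_priority_node_id=1))
```

With node 1 marked as lower priority, nodes 2, 3 and 4 keep their votes, so the site that holds two of them can keep quorum if the link between the sites goes down.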

However, this is not the only new feature that Microsoft provided for failover clustering. Force quorum resiliency provides additional support and flexibility for split-brain cluster scenarios. This bad scenario happens when a cluster breaks into subsets of cluster nodes that are not aware of each other. The node subset that has a majority of votes keeps running, while the others shut down. This scenario usually happens in multisite cluster deployments. If you want to start cluster nodes that do not have a majority, you can force quorum manually by using the /fq switch. So far, it’s all like before. However, in Windows Server 2012 R2, the cluster detects partitions automatically as soon as connectivity between nodes is restored. The partition that was started by forcing quorum is considered authoritative, and the other nodes rejoin the cluster. When this happens, the cluster is brought back to a single view of membership. Previously, in Windows Server 2012, partitioned nodes without quorum were not started automatically, and the administrator had to start them manually with the /pq switch. In Windows Server 2012 R2, both sides of the split cluster have a view of cluster membership, and they reconcile automatically when connectivity is restored.

Work Folders in Windows Server 2012 R2

The Work Folders functionality in Server 2012 R2 represents a significant enhancement over current technologies for data synchronization and accessibility. It provides the benefits of cloud-based solutions but still gives administrators the ability to control the technology’s settings and manage users’ data. Work Folders can be very useful for mobile users, especially in a BYOD environment.

Recently, I wrote a deep-dive article about this cool technology, and it is published on the Windows IT Pro site. Check it out here!

Failover Clustering in Windows Server 2012 R2 – Tie Breaker for 50% node split

Besides the ability to use Dynamic Quorum for failover clusters, clustering in Windows Server 2012 R2 is enhanced with one more very interesting feature.
The cluster is now able to automatically adjust a running node’s vote status in order to keep the total number of votes in the cluster at an odd number. This feature is called Tie Breaker for 50% Node Split, and it works together with dynamic witness functionality. Dynamic witness functionality is used to adjust the value of the quorum witness vote. For example, if you have a cluster with an even number of nodes and a file share witness, and the file share witness fails, the cluster will use dynamic witness functionality to automatically remove the vote from the file share witness.
However, since the cluster now has an even number of votes, the cluster tie breaker will randomly pick a node and remove its quorum vote to maintain an odd number of votes. If the nodes are evenly distributed across two sites, this will help to keep the cluster functional in one site. In previous Windows Server versions, if both sites have an equal number of nodes and the file share witness fails, both sites will stop the cluster.

If you want to avoid a node being picked randomly, you can use the LowerQuorumPriorityNodeID property to predetermine which node will have its vote removed. You can set this property by using the following PowerShell command:

(Get-Cluster).LowerQuorumPriorityNodeID = 1

Here, “1” is the example node ID for a node in the site that you consider less critical.
This is very handy in DR scenarios.

Windows Server 2012 R2 Failover Cluster – Global Update Manager

Windows Server 2012 R2 failover clustering implements a pretty interesting new feature that allows you to manage how the cluster database is updated.

The service responsible for updating the cluster database is called Global Update Manager. In Windows Server 2012, you were not able to configure how these updates work, but in Windows Server 2012 R2 you can configure the mode in which Global Update Manager works.

Each time the state of the cluster changes (for example, when a cluster resource goes offline), the nodes in the cluster must be notified about the event before Global Update Manager commits the change to the cluster database.

In Windows Server 2012, Global Update Manager works in Majority (read and write) mode. In this mode, when a change happens in the cluster, a majority of the cluster nodes must receive and process the update before it is committed to the database. When a cluster node wants to read the database, the cluster compares the latest timestamp from a majority of the running nodes and uses the data with the latest timestamp.

In Windows Server 2012 R2, Global Update Manager can also work in All (write) and Local (read) mode. In this mode, all nodes in the cluster must receive and process the update before it is committed to the database. However, when a database read request is received, the cluster reads the data from the database copy stored locally. Since all nodes received and processed the update, the local cluster database copy can be considered a reliable source of information.

Windows Server 2012 R2 also supports a third mode for Global Update Manager: Majority (write) and Local (read). In this mode, a majority of the cluster nodes must receive and process the update before it is committed to the database. When a database read request is received, the cluster reads the data from the database copy stored locally.

In Windows Server 2012 R2, the default setting for Hyper-V failover clusters is Majority (read and write). All other clustered workloads use All (write) and Local (read) mode. Majority (write) and Local (read) is not used by default for any workload.
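To compare the three modes side by side, here is a small Python model of the rules described above. It is only a sketch built from this description: the enum, the function names, and the idea of counting acknowledgements are assumptions made for illustration, not the Global Update Manager API.

```python
from enum import Enum


class GumMode(Enum):
    ALL_WRITE_LOCAL_READ = "All (write) and Local (read)"
    MAJORITY_READ_WRITE = "Majority (read and write)"
    MAJORITY_WRITE_LOCAL_READ = "Majority (write) and Local (read)"


def can_commit(mode, acknowledged, running_nodes):
    """True if enough nodes processed the update for it to be committed."""
    if mode is GumMode.ALL_WRITE_LOCAL_READ:
        return acknowledged == running_nodes          # every node must process it
    return acknowledged >= running_nodes // 2 + 1     # a majority is enough


def read_source(mode):
    """Where a database read is served from in the given mode."""
    if mode is GumMode.MAJORITY_READ_WRITE:
        return "latest timestamp among a majority of running nodes"
    return "local database copy"


def default_mode(is_hyperv_cluster):
    """Default mode in Windows Server 2012 R2, as stated in the text above."""
    if is_hyperv_cluster:
        return GumMode.MAJORITY_READ_WRITE
    return GumMode.ALL_WRITE_LOCAL_READ


if __name__ == "__main__":
    mode = default_mode(is_hyperv_cluster=True)
    print(mode.value, "| commit with 3 of 5 nodes:", can_commit(mode, 3, 5))
```

The trade-off is visible in the model: the All (write) mode makes local reads safe but lets a single slow node delay every commit, while the Majority modes commit sooner.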

Some tips for troubleshooting AD CS

In the last few weeks I was troubleshooting some PKI deployments based on Windows Server 2008 and 2012, so I decided to share some troubleshooting tips from the field.

In the first case, the customer deployed Windows Server 2008 R2 Standard edition and configured the CA role on it. Since the Standard edition of 2008 R2 supports creating and managing certificate templates, there was no need to deploy Enterprise. However, an attempt to install the Forefront Identity Manager 2010 R2 CA files failed, because the FIM setup wizard was looking for Enterprise or Datacenter edition on the CA. We decided to do an online upgrade to Enterprise edition by using the dism tool, and that went fine. However, from that point on the CA role was not able to see any custom certificate templates from AD DS, nor was it able to create new ones, although it was officially running Windows Server 2008 R2 Enterprise.

The solution was to fix things by using the ADSIEdit tool. I ran ADSIEdit, connected to the configuration partition of AD DS, and opened CN=Configuration | CN=Services | CN=Public Key Services | CN=Enrollment Services. In this container, right-click the problematic CA name and open Properties. Switch to Attributes and look for the flags attribute. For Enterprise CAs this attribute should have the value 10. In my case, the value was 2. After I changed it manually to 10 and restarted AD CS, everything was fine.

In the second case, the customer had a huge number of failed and pending requests on his CA as a result of improperly configured autoenrollment. We are talking about 10,000+ failed or pending requests. I had to clean up this mess, and I used a fairly simple method to do it. Execute this command:

certutil -deleterow 01/06/2013 Request

As a result, all pending and failed requests generated before June 1st 2013 will be deleted (the date is read in the server's regional format, day/month/year in this case). Be aware, however, that this command can clean up only around 2,500 rows in one pass. If you have more requests to clean, the command will throw an error when the pass is done. Don’t worry about that; just re-run the same command a few times until everything is cleaned up.
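If you do not want to re-run the command by hand, the retry can be scripted. The Python sketch below is only an illustration under two assumptions that I have not verified on every version: that certutil returns a non-zero exit code when it stops at the row limit, and zero when the last pass finishes cleanly. The helper names are made up, and the `run` parameter exists only so the loop can be tried without a CA.

```python
import subprocess


def deleterow_command(cutoff_date, table):
    """Build the certutil call used above, e.g. table 'Request' or 'Cert'."""
    return ["certutil", "-deleterow", cutoff_date, table]


def run_until_clean(command, run=None, max_passes=20):
    """Re-run the command until it exits with 0; return the number of passes.

    Assumption: a non-zero exit code means 'row limit hit, run me again'.
    max_passes guards against looping forever on a real error.
    """
    if run is None:
        run = lambda cmd: subprocess.run(cmd).returncode
    for attempt in range(1, max_passes + 1):
        if run(command) == 0:
            return attempt
    raise RuntimeError("still failing after %d passes, check the error manually" % max_passes)


if __name__ == "__main__":
    # Dry run with a fake runner that fails twice and then succeeds.
    fake_exit_codes = iter([1, 1, 0])
    passes = run_until_clean(deleterow_command("01/06/2013", "Request"),
                             run=lambda cmd: next(fake_exit_codes))
    print("passes needed:", passes)
```

On a real CA, drop the `run` argument so that certutil is actually executed, and keep `max_passes` low enough that a genuine failure is noticed quickly.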

Similarly, if you have a large number of expired certificates in the Issued Certificates store on the CA, you can use a similar command to clean them up. Execute:

certutil -deleterow 01/06/2013 Cert

All certificates that expired before June 1st 2013 will be deleted.

And if you need to delete a specific request, find the appropriate request ID and execute the following command, replacing RequestID with that ID:

certutil -deleterow RequestID

After you clean up the mess on the CA, it’s a good idea to defragment the CA database. The same utility that is used for AD DS database defragmentation, eseutil, is used here. Stop the certsvc service first, because the database must be offline, and then run eseutil /d pathtoCAdbfile.

SAN in certificates – might be useful

From time to time, I find myself searching through my own blog site (the old one) for this information. So, if you ever need to configure Windows Server 2003 or 2008 to issue certificates with subject alternative names, you will need to execute the following commands on the CA computer:

certutil -setreg policy\EditFlags +EDITF_ATTRIBUTESUBJECTALTNAME2
net stop certsvc
net start certsvc

After this, your CA will be able to issue certificates with SANs. You can do it by sending a req file to the CA, or by using the web console. If you are using the web console, choose to perform an advanced certificate request, and then enter the alternative names in the Attributes field in this format:

san:dns=dns.name[&dns=dns.name]

For example: san:dns=exchange.domain.com&dns=autodiscover.domain.com
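If you have a longer list of names, a tiny helper can build the attribute string for you. This Python sketch only assembles the text in the format shown above. The function name is made up, and it handles DNS names only.

```python
def build_san_attribute(dns_names):
    """Return the Attributes-field string, e.g. 'san:dns=a.example&dns=b.example'."""
    names = [name.strip() for name in dns_names if name.strip()]
    if not names:
        raise ValueError("at least one DNS name is required")
    return "san:" + "&".join("dns=" + name for name in names)


if __name__ == "__main__":
    print(build_san_attribute(["exchange.domain.com", "autodiscover.domain.com"]))
```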