Load Balancing in Exchange Server 2016

When TMG reached end of life, people asked how they should publish Exchange services from then on. We now face a similar question about load balancing Exchange Server, because Windows Network Load Balancing (WNLB) is no longer a supported option. The architectural changes in Exchange Server 2016 left a single Mailbox server role (Client Access services are now integrated into it), which makes WNLB impossible to use. The reason is simple: you can’t use WNLB and Failover Clustering (required by a DAG) on the same machines. To be honest, even if that were not the case, WNLB is not a solution I’d recommend to anyone for Exchange load balancing. It’s too old and has too many issues for a critical service like Exchange.
So, obviously, we need an external solution for load balancing client access traffic. In most cases we are talking about HTTPS-based traffic, as most Exchange client traffic (except for SMTP) uses this protocol.
With Exchange 2013, Microsoft announced that expensive Layer 7 load balancers are no longer required for Exchange: you can also use a cheaper and simpler Layer 4 load balancer for client requests. Even better, there’s no need for session affinity anymore. This is because the client access services now proxy the connection to a Mailbox server, so it doesn’t really matter where the client actually establishes the session. This simplifies the requirements for Exchange load balancers even more.
Let’s see what the real difference between the two is, and why you should care about it. You might be surprised that the main issue here is actually the node health check. Luckily, Microsoft provided a very intelligent way to handle this, even with cheap (or free) load balancers. As I’m currently writing a course on Exchange 2016, I’ll use some of my draft materials to clarify this.

Load balancers that work on Layer 4 are not aware of the actual content of the traffic being load balanced. The load balancer forwards the connection based on the IP address and port on which it received the client’s request, and it has no knowledge of the target URL or the request content. For example, a Layer 4 load balancer cannot tell whether a client is connecting with Outlook on the web or with Exchange ActiveSync, as both connections use the same port (443). Layer 4 load balancers are also unable to detect the actual functionality of a server node in the load balancing pool. For example, a Layer 4 load balancer can detect that one of the servers in the pool is completely down, because it does not respond to ping, but it can’t detect whether the IIS service on that server is working. From the client access perspective, if IIS is not working on the server, it is almost the same as if the server were down. However, a Layer 4 load balancer will not mark the server down in this case. Some Layer 4 load balancers can provide a simple health check by testing the availability of a specific virtual directory, such as /owa, but a working virtual directory does not guarantee that the others are also working fine.
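To make the limitation concrete, here is a minimal sketch of the kind of check a Layer 4 device can do, assuming it tests a TCP connection instead of ping. The function name and the host name in the usage note are my own placeholders and are not taken from any load balancer product.

```python
import socket


def tcp_port_open(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    This is all a transport-level check can see. A True result says nothing
    about whether /owa, /ecp or /mapi actually work behind that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or name resolution failed.
        return False
```

Calling tcp_port_open("mbx1.contoso.example") (a hypothetical server name) only tells you that something accepts connections on port 443, which is exactly why this kind of check cannot distinguish a healthy protocol from a broken one.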

Load balancers that work on Layer 7 of the OSI model are much more intelligent. A Layer 7 load balancer is aware of the type of traffic passing through it. This type of load balancer can inspect the content of the traffic between the clients and the Exchange server, and it uses the results of that inspection to make its forwarding decisions. For example, it can route traffic based on the virtual directory to which a client is trying to connect, such as /owa, /ecp or /mapi, and it can use different routing logic depending on the URL the client is connecting to. When using a Layer 7 load balancer, you can also leverage the Managed Availability feature of Exchange Server 2016. This built-in feature monitors the critical components and services of the Exchange server and can take action based on the results.
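As an illustration of routing by virtual directory, here is a small sketch of the decision a Layer 7 device makes once it can see the request path. The pool names and the choose_pool function are hypothetical; a real load balancer expresses this in its own configuration language.

```python
from urllib.parse import urlsplit

# Hypothetical pool names, used only for illustration.
POOLS_BY_PREFIX = {
    "/owa": "pool-owa",
    "/ecp": "pool-ecp",
    "/mapi": "pool-mapi",
}
DEFAULT_POOL = "pool-default"


def choose_pool(request_target: str) -> str:
    """Pick a server pool from the virtual directory in the request path."""
    # Drop any query string and compare case-insensitively.
    path = urlsplit(request_target).path.lower()
    for prefix, pool in POOLS_BY_PREFIX.items():
        # Match "/owa" and "/owa/..." but not "/owa-something-else".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL
```

A Layer 4 device cannot make this decision at all, because with HTTPS the path is only visible after the TLS session has been terminated and the request has been read.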

Managed Availability uses Probes, Monitors and Responders as components that work together. These components test, detect, and try to resolve possible problems. The Probe component is used first. It tries to gather information or execute diagnostic tests for a specific Exchange component. After that, a Monitor component evaluates the results that the Probe provides and uses them to decide whether the component is healthy or unhealthy. If a component is unhealthy, a Responder component can take measures to bring that failed component back to a healthy state. This can include a service restart, a database failover or, in some cases, a server reboot.
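The probe, monitor and responder flow can be pictured with a toy model like the one below. This is only a conceptual sketch and not how Exchange implements it: the ProbeResult type, the run_cycle function and the rule "unhealthy after three consecutive failures" are all my own assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ProbeResult:
    component: str
    succeeded: bool


def monitor(results: List[ProbeResult], failure_threshold: int = 3) -> bool:
    """Return True (healthy) unless the last `failure_threshold` results all failed.

    The threshold rule is an assumption made for this sketch.
    """
    recent = results[-failure_threshold:]
    if len(recent) < failure_threshold:
        return True  # not enough evidence yet to call the component unhealthy
    return any(r.succeeded for r in recent)


def run_cycle(
    probe: Callable[[], ProbeResult],
    history: List[ProbeResult],
    responder: Callable[[str], None],
    failure_threshold: int = 3,
) -> bool:
    """Run one probe, let the monitor evaluate it, and invoke the responder if unhealthy."""
    history.append(probe())
    healthy = monitor(history, failure_threshold)
    if not healthy:
        # In Exchange the recovery action could be a service restart or a failover.
        responder(history[-1].component)
    return healthy
```

The point of the split is that the component that measures (the probe) is separate from the one that decides (the monitor) and the one that acts (the responder).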

If a critical server component is healthy, Managed Availability generates a web page named healthcheck.htm. You can find this web page under each virtual directory, for example /owa/healthcheck.htm or /ecp/healthcheck.htm. If Managed Availability detects that a server component is unhealthy, the health check web page is not available and a 403 error is returned. This means you can point your load balancer to the health check web page of each critical service.

A Layer 7 load balancer can use this page to check whether critical services are functional and, based on that information, decide whether to forward client connections to that node. If the load balancer health check receives a 200 status response from the health check web page, the service or protocol is up and running. If the load balancer receives a 403 status code, Managed Availability has marked that protocol instance down on the Mailbox server.
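Here is a minimal sketch of such a health check in Python, mainly to show how little logic the load balancer side needs. The function names and the host name in the usage note are placeholders of mine, and I am assuming that anything other than a 200 response (including no response at all) should be treated as down.

```python
import urllib.error
import urllib.request
from typing import Optional


def healthcheck_url(host: str, vdir: str) -> str:
    """Build the health check URL for one virtual directory, e.g. owa or ecp."""
    return f"https://{host}/{vdir.strip('/')}/healthcheck.htm"


def fetch_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers such as 403 arrive here and still carry a status code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS or TLS failure.
        return None


def status_means_up(status: Optional[int]) -> bool:
    """Assumption for this sketch: only a 200 response counts as healthy."""
    return status == 200
```

For example, status_means_up(fetch_status(healthcheck_url("mbx1.contoso.example", "owa"))) would check the OWA protocol on a hypothetical server, and the same call with "ecp" or "mapi" would check the other protocols independently.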

Although it might look as if the load balancer performs only a simple health check against the server nodes in the pool, the health check web page reflects the overall health of the workload, because it takes into account multiple internal health probes performed by Managed Availability.

It is highly recommended that you configure your load balancer to perform the node health check based on the information provided by the Managed Availability feature. If you don’t, the load balancer could direct client access requests to a server that Managed Availability has marked unhealthy. In the end, this results in inconsistent management and a negative user experience.