In different projects, I have had to implement load balancing for multi-region deployments of Azure API Management. With multi-region deployments, API Management lets you enable a built-in external load balancer. This means that public traffic is routed to a regional gateway based on the lowest latency, without additional configuration or the help of any other service.
However, it is quite common that organisations want to use API Management for both internal and external consumers, exposing just a subset of the APIs to the public internet. Azure Application Gateway enables those scenarios. Additionally, it provides a Web Application Firewall to protect APIs from malicious attacks. When organisations integrate API Management with Application Gateway, the built-in external load balancer can no longer be used.
The simplest way to implement load balancing across multiple regions for public traffic to API Management exposed via Application Gateway is to use Azure Traffic Manager. But what should you do when private traffic, for instance coming from on-premises, must also be load-balanced across the multiple regions of API Management? According to this document:
"For multi-region API Management deployments configured in internal virtual network mode, users are responsible for managing the load balancing across multiple regions, as they own the routing."
At the time of writing, Azure does not have a highly available service offering for load-balancing private HTTP traffic across regions with health probing capabilities. There is a feature request in the networking feedback forums that raises this gap. Thus, currently, there is no simple approach to implement this. Some months back, I raised an issue on this documentation page due to the lack of clarity on the topic.
In this post, I will explain how we can implement highly available load balancing for both public and private HTTP traffic to Azure API Management when it is integrated with Application Gateway and deployed in multiple regions. The approach I show here relies on having public health-probe endpoints exposed via Application Gateway.
Before I explain how this could be implemented, I would like to cover the approaches that were discarded and the rationale for discarding them. Even though they are not feasible, it is worth reviewing them here, as these alternatives could be raised as part of a technical decision process. This section could save you some time by steering you away from dead-end alternatives.
There are some load balancing service offerings on Azure that target internal traffic. However, they could not be considered for the reasons outlined below.
If you want to read more about these options and how they compare to Traffic Manager, refer to this article.
Another option that was discarded was using the Windows DNS service. While this service can be geo-distributed and supports Application Load-Balancing with round-robin or basic weighted routing, it does not have health probing capabilities, so traffic could potentially be directed to unavailable or unhealthy endpoints.
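To make the health-probing gap concrete, here is a minimal sketch that contrasts plain round-robin selection with a health-aware variant. It is not Windows DNS configuration; the function names and IP addresses are hypothetical and only illustrate the behaviour described above.

```python
from typing import Callable, List


def pick_round_robin(endpoints: List[str], request_number: int) -> str:
    """Plain round-robin: rotates through endpoints with no knowledge of health."""
    return endpoints[request_number % len(endpoints)]


def pick_health_aware(
    endpoints: List[str],
    request_number: int,
    is_healthy: Callable[[str], bool],
) -> str:
    """Round-robin restricted to the endpoints that pass the health check."""
    healthy = [e for e in endpoints if is_healthy(e)]
    if not healthy:
        raise RuntimeError("no healthy endpoints available")
    return healthy[request_number % len(healthy)]


if __name__ == "__main__":
    # Hypothetical internal IPs of two regional instances; the second is down.
    endpoints = ["10.1.0.5", "10.2.0.5"]
    down = {"10.2.0.5"}

    for n in range(4):
        plain = pick_round_robin(endpoints, n)
        aware = pick_health_aware(endpoints, n, lambda e: e not in down)
        print(f"request {n}: plain={plain} health-aware={aware}")
```

With the second endpoint down, the plain variant still hands it out on every other request, while the health-aware variant never does.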
Microsoft has documented two approaches for using Azure Traffic Manager to fail over private endpoints on Azure, with some limitations as summarised below.
Now, let us discuss how to implement highly available load balancing of public and private traffic to API Management across multiple regions.
If you are planning to implement this in your organisation, there are some prerequisites that you need to bear in mind.
Before we get into the technical details of how this approach works, I recommend you make sure that you understand the concepts below.
In this hypothetical scenario, we want HTTP requests to the domain api.pacodelacruz.io to be directed to one of the available and healthy API Management endpoints. API Management is deployed in two regions and has an internal load balancer (ILB). Application Gateway is used to expose a subset of APIs to the public.
As mentioned above, a multi-region deployment of API Management Premium provides a built-in external load balancer for public HTTP traffic that routes requests to a regional gateway based on the lowest latency. However, you can implement custom routing and load balancing using Traffic Manager. Custom traffic routing is required when API Management is integrated with Application Gateway.
The conceptual architecture diagram below depicts the public traffic load-balancing scenario.
The flow of the public HTTP traffic would be as described in the sequence diagram shown below.
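The public client’s flow can be summarised as follows: the client requests api.pacodelacruz.io, the public DNS points that hostname to Traffic Manager, and Traffic Manager directs the client to api1.pacodelacruz.io or api2.pacodelacruz.io, which the public DNS resolves to the Application Gateway in the corresponding region.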
For the sequence described above to work, we need the configuration below:
Public DNS

Hostname | Pointing to
api.pacodelacruz.io | Azure Traffic Manager’s hostname
api1.pacodelacruz.io | FQDN of the Application Gateway in the primary region
api2.pacodelacruz.io | FQDN of the Application Gateway in the secondary region
Traffic Manager Profile

Setting | Value
Endpoint types | external endpoint
Endpoint 1 | api1.pacodelacruz.io
Endpoint 2 | api2.pacodelacruz.io
Endpoint monitor settings / Custom Header settings | host:api.pacodelacruz.io
You might be wondering why we need the api1.pacodelacruz.io and api2.pacodelacruz.io hostnames when we can route traffic from Traffic Manager directly to the Application Gateway hostnames. This will become evident when we see the configuration required for private traffic.
The Traffic Manager endpoint monitor settings / custom header settings allow keeping the original hostname (api.pacodelacruz.io) in the health probe requests while also using the existing TLS certificate without having to add unnecessary aliases.
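As a rough illustration of what such a probe looks like on the wire, the sketch below connects to a regional hostname but sends the original hostname in both the Host header and the TLS handshake. It is a hand-rolled approximation, not Traffic Manager's actual implementation; the function names, the probe path, and the choice to treat only HTTP 200 as healthy are assumptions.

```python
import socket
import ssl


def build_probe_request(original_host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.1 GET whose Host header is the original hostname."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {original_host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def parse_status_code(status_line: bytes) -> int:
    """Extract the numeric status from a status line such as 'HTTP/1.1 200 OK'."""
    return int(status_line.split()[1])


def probe(endpoint_host: str, original_host: str, path: str, timeout: float = 10.0) -> bool:
    """Connect to endpoint_host, but present original_host for TLS and HTTP.

    The certificate is validated against original_host, so in this sketch the
    regional hostname needs no extra alias on the certificate.
    """
    context = ssl.create_default_context()
    with socket.create_connection((endpoint_host, 443), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=original_host) as tls:
            tls.sendall(build_probe_request(original_host, path))
            with tls.makefile("rb") as reader:
                # Assumption: only HTTP 200 counts as healthy.
                return parse_status_code(reader.readline()) == 200


# Example call (needs network access; "/health" is a hypothetical probe path):
# probe("api1.pacodelacruz.io", "api.pacodelacruz.io", "/health")
```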
How to integrate Application Gateway with an API Management instance deployed within a VNET is detailed in this article. Application Gateway needs to use the API Management regional endpoints for the backend pool and health probing endpoints as described here. API Management, in turn, will need to route API calls to regional backend services as defined here.
Once we have the custom load balancing of public traffic set up via Traffic Manager, we can proceed to configure the private traffic load balancing. A conceptual architecture diagram is depicted in the figure below.
The flow of the private HTTP traffic is as depicted in the sequence diagram shown below.
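The private client’s flow can be summarised as follows: the client requests api.pacodelacruz.io, Traffic Manager directs it to api1.pacodelacruz.io or api2.pacodelacruz.io, and the private DNS resolves that hostname to the internal IP address of the corresponding regional API Management instance.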
For the sequence described above to work, we need the configuration below:
Private DNS

Hostname | Pointing to
api1.pacodelacruz.io | Internal IP address of the primary regional API Management instance
api2.pacodelacruz.io | Internal IP address of the secondary regional API Management instance
All the remaining setup was done when configuring the custom public traffic load balancing as described above.
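The interplay between the public DNS, Traffic Manager and the private DNS can be simulated with a few dictionaries. This is only a toy model of name resolution: the Traffic Manager and Application Gateway names, the internal IP addresses, and the "first healthy endpoint wins" routing are all assumptions made for illustration.

```python
from typing import Dict, List, Set

TRAFFIC_MANAGER = "traffic-manager.example"  # hypothetical Traffic Manager hostname

# Public DNS: hostname -> name it points to (hypothetical Application Gateway FQDNs).
PUBLIC_DNS: Dict[str, str] = {
    "api.pacodelacruz.io": TRAFFIC_MANAGER,
    "api1.pacodelacruz.io": "appgw-primary.example",
    "api2.pacodelacruz.io": "appgw-secondary.example",
}

# Private DNS: hostname -> internal IP of the regional API Management instance.
PRIVATE_DNS: Dict[str, str] = {
    "api1.pacodelacruz.io": "10.1.0.5",  # hypothetical
    "api2.pacodelacruz.io": "10.2.0.5",  # hypothetical
}


def traffic_manager_answer(healthy: Set[str]) -> str:
    """Return the first healthy endpoint (assumed priority-style routing)."""
    for endpoint in ("api1.pacodelacruz.io", "api2.pacodelacruz.io"):
        if endpoint in healthy:
            return endpoint
    raise LookupError("no healthy endpoint")


def resolve(name: str, private_client: bool, healthy: Set[str]) -> List[str]:
    """Follow the resolution chain and return every name visited."""
    chain = [name]
    while True:
        if name == TRAFFIC_MANAGER:
            name = traffic_manager_answer(healthy)
        elif private_client and name in PRIVATE_DNS:
            chain.append(PRIVATE_DNS[name])  # private zone wins for internal clients
            return chain
        elif name in PUBLIC_DNS:
            name = PUBLIC_DNS[name]
        else:
            return chain  # nothing further to follow
        chain.append(name)


if __name__ == "__main__":
    both = {"api1.pacodelacruz.io", "api2.pacodelacruz.io"}
    print(resolve("api.pacodelacruz.io", private_client=False, healthy=both))
    print(resolve("api.pacodelacruz.io", private_client=True, healthy={"api2.pacodelacruz.io"}))
```

The same Traffic Manager answer ends at an Application Gateway for a public client and at an internal IP address for a private client, which is why the intermediate api1 and api2 hostnames are needed.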
As mentioned earlier, this approach for load-balancing private traffic only works if you have public health probe endpoints. When this is not the case, i.e. you have a multi-region deployment of Azure API Management with ILB only and this is not exposed to the public internet via Application Gateway, a different approach or workaround would be required.
One simple option is that you implement two sets of Azure Functions to expose the health status of internal endpoints to the public internet.
This way, the Traffic Manager could rely on the publicly exposed health probe endpoints without exposing internal APIs.
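The core of such a health-status function could look like the sketch below. It deliberately leaves out the Azure Functions hosting code and only shows the mapping from an internal check to a public probe response; the function name, the injected check, and the choice of 200 and 503 as status codes are assumptions.

```python
from typing import Callable, Tuple


def health_probe_response(check_internal: Callable[[], bool]) -> Tuple[int, str]:
    """Translate the result of an internal health check into a public response.

    check_internal is a hypothetical callable that probes the internal
    API Management endpoint from inside the virtual network.
    """
    try:
        healthy = check_internal()
    except Exception:
        # Any failure to reach the internal endpoint is reported as unhealthy.
        healthy = False
    # Only the status is exposed; no internal API details leave the network.
    return (200, "healthy") if healthy else (503, "unhealthy")


def unreachable() -> bool:
    """Stand-in for an internal check that cannot reach its target."""
    raise TimeoutError("internal endpoint did not respond")


if __name__ == "__main__":
    print(health_probe_response(lambda: True))
    print(health_probe_response(unreachable))
```

Injecting the check as a callable keeps the sketch independent of how the internal endpoint is actually reached.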
Throughout this post, I have described how you can implement load balancing of public and private traffic to an API Management instance deployed to multiple regions and exposed via Application Gateway. As you have seen, once API Management is integrated with Application Gateway, implementing load balancing for both public and private traffic only requires adding a Traffic Manager profile and certain records in both the public DNS registrar and the private DNS zone.
If you want to be able to load-balance private traffic only, without exposing health probe endpoints to the public internet, consider voting for this feature request so that this is provided as a built-in networking feature on Azure.
I hope you have found this post useful!
Happy load balancing!
Cross-posted on Paco’s Blog
Follow Paco on @pacodelacruz