Using Azure Traffic Manager to divert to a holding page for a site that's not available
In the first post in this series I put together a holding page that uses Azure Storage to write "call me back" details to Table Storage (so it doesn't rely on anything other than Azure). In the second post I published the page to Azure as an App Service and configured it to respond to custom DNS (i.e. the domains that would be covered by the holding page) without any service interruption to the existing site(s).
The next step is to hook up Azure Traffic Manager so that the process of failing over to the holding page (and failing back) is automatic. Without using Azure Traffic Manager, switching between the normal sites and the holding page means visiting my DNS provider and updating it so that the relevant domain names point either to the App Service in Azure, or the servers that host the sites that would ordinarily be returned. Using Azure Traffic Manager this becomes automatic.
What's Azure Traffic Manager and how does it work?
Azure Traffic Manager (cribbing from the overview page at azure.microsoft.com) is DNS-based load balancing that:
... operates at the DNS layer to quickly and efficiently direct incoming DNS requests based on the routing method of your choice. An example would be sending requests to the closest endpoints, improving the responsiveness of your applications
So, put quite simply, Traffic Manager acts as an intermediary between your DNS provider and the rest of the internet, giving back the address (endpoint) that traffic should be routed to based on a decision it makes. As of the time of writing this decision can be based on one of six routing methods: Priority, Weighted, Performance, Geographic, Multivalue and Subnet.
Each of the different ways that Traffic Manager can decide how to direct requests has its merits, but the one I'm using for failover purposes is Priority. This is described as:
Select Priority when you want to use a primary service endpoint for all traffic, and provide backups in case the primary or the backup endpoints are unavailable.
With Traffic Manager set up to use priority, the endpoint returned will be the one that has the highest priority of the currently healthy endpoints. An endpoint here is the server or service that traffic for a given domain can be sent towards; remember, this happens by responding to DNS requests, not by tunnelling all data received and requested through Azure itself. The health of endpoints is monitored by periodically making an HTTP(S) request to each one and deeming it healthy when the HTTP status code of the response falls within a range that the profile is configured to consider 'healthy', e.g. a response code of 500 probably wouldn't be one you'd put in the healthy list.
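To make the "healthy status code" idea concrete, here's a minimal Python sketch of how a probe response could be checked against a list of accepted ranges. The range syntax used here (comma-separated `low-high` pairs, with a bare number meaning a single code) and the function names are assumptions made for illustration; this isn't Traffic Manager's own code or API.

```python
def parse_ranges(spec: str) -> list[tuple[int, int]]:
    """Parse a string such as "200-299,301" into inclusive (low, high) pairs.

    The string format is an assumption made for this sketch.
    """
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        # A bare number like "301" is treated as the range 301-301.
        ranges.append((int(low), int(high or low)))
    return ranges


def is_healthy(status_code: int, spec: str = "200") -> bool:
    """Return True when the probe's status code falls in any accepted range."""
    return any(low <= status_code <= high for low, high in parse_ranges(spec))
```

With the default of `"200"` only an exact 200 counts as healthy, so a redirect (301) or a server error (500) would mark the endpoint as unhealthy unless the accepted ranges are widened.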
Configuring Azure Traffic Manager
The first thing to set up in Azure is a Traffic Manager Profile. This doesn't impact any service availability and only takes a few minutes. Head over to the Azure Portal, choose + Create a resource at the top-left, search for Traffic Manager and click through it in the search results:
Azure will open a pane containing some blurb about Traffic Manager with a Create button at the bottom; click that to get started. There are only two settings required that are specific to Traffic Manager: Name and Routing method. The name will form part of the CNAME for the Traffic Manager profile that will later get plugged into your DNS. Choose whatever you want here, as it's not going to be seen unless someone goes spelunking in your DNS. The Routing method is one of the six methods I mentioned earlier, so I'm going to choose Priority:
With those set, clicking Create sends Azure off to start creating the profile (this can take a few minutes). Whilst that's happening I'm going to describe some DNS configuration changes that need to be made.
Setting up DNS to work with Traffic Manager + App Services
In order to have a mixture of App Service and external endpoints in a Traffic Manager profile, the external endpoints (your servers) need to be set up as CNAMEs. That means that if contact.robertwray.co.uk is currently an A record that points at 127.0.0.1 (I'm using localhost as a placeholder IP here) it won't work with Traffic Manager: you can't plug an IP address for Traffic Manager into that A record (because there isn't really one), nor can you take your server's IP address and plug it into Azure as an endpoint, because if you do you'll see this:
The thing to do is to create a new A record that points to your IP address (e.g. ip-127-0-0-1.robertwray.co.uk pointing to 127.0.0.1) and replace the current A record for contact.robertwray.co.uk with a CNAME that points to ip-127-0-0-1.robertwray.co.uk.
Getting this bit of DNS set up, along with the indirection for the underlying IP address, means that switching over to Traffic Manager later is a little bit easier.
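As a rough illustration of that indirection, the two records can be modelled as a small lookup table and resolved by following the CNAME until an A record is reached. This is a toy sketch in Python, not a real DNS client: the `RECORDS` table and the `resolve` helper are made up for the example, and the names and placeholder IP are the ones used above.

```python
# Toy DNS zone: name -> (record type, value), using the records described above.
RECORDS = {
    "contact.robertwray.co.uk": ("CNAME", "ip-127-0-0-1.robertwray.co.uk"),
    "ip-127-0-0-1.robertwray.co.uk": ("A", "127.0.0.1"),
}


def resolve(name: str, records: dict[str, tuple[str, str]], max_hops: int = 8) -> str:
    """Follow CNAME records until an A record is reached and return its IP."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: carry on with the name it points at
    raise RuntimeError("too many CNAME hops")


print(resolve("contact.robertwray.co.uk", RECORDS))  # prints 127.0.0.1
```

Because contact.robertwray.co.uk is now only a pointer, sending it somewhere else later means changing a single CNAME value while the A record for the server stays untouched.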
Once Azure has finished creating the Traffic Manager profile, load it up and click on Endpoints under Settings, which will bring up an empty list of endpoints ready for them to be added. There's a + Add button just above the list and that's where we want to click next to create the two endpoints we want: an External endpoint pointing at the record in DNS that points to our IP address, and an Azure endpoint pointing at the App Service that's going to be used when the real site isn't available.
First up, click the add button and enter the details of the external endpoint. Choose External endpoint for the Type, give it a sensible Name (I'm going to call it contact.robertwray.local), enter the DNS name for the IP address in the Fully-qualified domain name (FQDN) or IP field (so ip-127-0-0-1.robertwray.co.uk) and assign a Priority, which I'm going to leave set to 1. The last field, which is optional, is Custom Header settings, which comes in handy if you're hosting multiple sites behind the IP address. I'm going to specify host:contact.robertwray.co.uk, which means that when Traffic Manager performs health monitoring it'll specify a host header, ensuring that it hits the correct site. There's also an Add as disabled checkbox so that you can add new endpoints to a Traffic Manager profile that's already in service without them being used. Once all that's filled in, click OK and Azure will validate and then create the new endpoint.
Next up is creating the endpoint that points to the App Service. Click on the add button again and choose Azure Endpoint for the Type, again give it a sensible Name (I'm going to call it Holding Page) and select App Service for Target resource type. Once that's done the Target resource field underneath will change to say Choose an app service; click on this and you'll be shown a list of all your available App Services. I'm going to click on holding-page-test and leave the Priority and Custom Header settings as they are:
Click OK and again Azure will validate and then create the new endpoint, which means that the Traffic Manager profile is now complete with an endpoint for the "real" service and an endpoint for the holding page App Service in Azure.
The reason I set them up in the order I did was to have them acquire their Priority by default. When a profile is using priority as its routing method, all traffic is sent to the healthy endpoint with the lowest priority number (the lower the number, the higher the priority). That means that when the real service is healthy it'll take all traffic as its priority is 1, and when it isn't, the Azure App Service will take all traffic as its priority is 2. You could set them up in any order and then adjust the priorities, but it's easier to get it right from the start.
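The selection rule can be sketched in a few lines of Python. This is only a model of the behaviour described here, not how Traffic Manager is implemented: the `Endpoint` class and `pick_endpoint` function are hypothetical, the holding page hostname assumes the App Service's default azurewebsites.net name, and returning `None` when nothing is healthy is a simplification of this sketch rather than a statement about what the real service does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    name: str       # label shown in the endpoint list
    target: str     # DNS name handed back to the client
    priority: int   # lower number = preferred
    healthy: bool   # result of the most recent health probe


def pick_endpoint(endpoints: list[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest priority number, if any."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None  # simplification: the sketch doesn't model this case
    return min(healthy, key=lambda e: e.priority)


site = Endpoint("contact.robertwray.local", "ip-127-0-0-1.robertwray.co.uk", 1, True)
# Assumed default hostname for the holding-page-test App Service.
holding = Endpoint("Holding Page", "holding-page-test.azurewebsites.net", 2, True)

print(pick_endpoint([holding, site]).name)  # prints contact.robertwray.local
site.healthy = False
print(pick_endpoint([holding, site]).name)  # prints Holding Page
```

The order the endpoints are listed in makes no difference; only the priority number and the health status do.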
Once Azure has finished configuring everything, it'll probably look like this:
Click on Configuration under Settings to pull up the settings that are used to drive how Azure monitors endpoints for this Traffic Manager profile. Once it's loaded you can see under Endpoint monitor settings that the endpoints are being monitored by calling them on port 80 over HTTP and that Azure is expecting to receive an HTTP 200 response when it does so (from the Port, Protocol and Expected Status Code Ranges settings respectively). As a plain HTTP request to pretty much any site on the web now results in a redirect to HTTPS, there's little chance of getting a 200 response, so go ahead and change this to HTTPS over port 443 and click on Save. Now click on Overview in the sidebar to go back to the page that I've shown above. Here's how it's looking for me right now:
You can see that one of the endpoints is listed as Degraded, and it's the one with the highest priority, so right now any requests routed through Traffic Manager would result in DNS responses directed towards the Holding Page endpoint. This shows exactly what would happen when the real service is unavailable! I've tweaked the configuration for the site behind the endpoint and now it's healthy:
So now, all requests will get directed to the external endpoint at contact.robertwray.co.uk.
The last piece of the puzzle is updating my external DNS so that requests to contact.robertwray.co.uk get handled by Azure Traffic Manager. Thanks to the changes to DNS I made earlier (making contact.robertwray.co.uk a CNAME pointing to the intermediary ip-127-0-0-1.robertwray.co.uk entry) this is as simple as updating that CNAME so that it points to holding-page-test.trafficmanager.net. With that done, Traffic Manager will now respond to DNS requests for contact.robertwray.co.uk and direct requests to either my site or the Holding Page - with my site being preferred. Job done!
Using the fantastic mxtoolbox.com, here's the result of disabling the External endpoint (to simulate it being not available):
Before I disabled it, Traffic Manager routed requests to the external endpoint and afterwards to the holding page Azure App Service! Note the nice short TTL that's being given there by Azure. That keeps the amount of time that an unresponsive endpoint is being routed to as low as possible, though you can tweak this up or down in the Configuration for your Traffic Manager Profile.
That's all there is to it: tweak DNS to prepare for using Traffic Manager, create the Traffic Manager profile, plug it into DNS, done!