We've received many questions from members of the Azure community asking how we built and operate our highly available directory, especially our REST-based GraphAPI.
A key challenge in operating a cloud service is keeping the service available at all times and across multiple geographies. One of the ways we meet this challenge is by using an availability proxy, which allows us to operate multiple instances of our service that are all kept in sync. For example, it lets us make all instances available for reads and updates, but if one instance goes down, its load is redirected and the others take over. When the downed instance becomes available again, it starts receiving all of the updates it missed, and once it has caught up, it rejoins the group. We use the proxy internally to manage configuration data and operate some experimental services. We are excited about it because it is useful and simple to use. We are making the source available under the Apache 2.0 license so service architects and developers can add it to their toolbox for developing highly available, geo-distributed services in Azure.
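To make the failover and catch-up behavior concrete, here is a minimal Python sketch of the idea. It is not the proxy's actual implementation: it keeps a single in-memory update log instead of running a consensus protocol across machines, and every name in it (`AvailabilityProxySketch`, `Instance`, `fail`, `recover`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One hypothetical service instance holding a small key-value store."""
    name: str
    up: bool = True
    applied: int = 0  # number of log entries this instance has applied
    data: dict = field(default_factory=dict)


class AvailabilityProxySketch:
    """Toy proxy: routes to healthy instances and replays missed updates."""

    def __init__(self, names):
        self.instances = [Instance(n) for n in names]
        self.log = []    # ordered (key, value) updates; an assumed simplification
        self._next = 0   # round-robin cursor for reads

    def _find(self, name):
        return next(i for i in self.instances if i.name == name)

    def _group(self):
        # An instance serves traffic only if it is up and fully caught up.
        return [i for i in self.instances
                if i.up and i.applied == len(self.log)]

    def write(self, key, value):
        group = self._group()
        if not group:
            raise RuntimeError("no instance available")
        self.log.append((key, value))
        for inst in group:  # instances that are down simply miss the update
            inst.data[key] = value
            inst.applied = len(self.log)

    def read(self, key):
        group = self._group()
        if not group:
            raise RuntimeError("no instance available")
        inst = group[self._next % len(group)]  # spread reads across the group
        self._next += 1
        return inst.name, inst.data.get(key)

    def fail(self, name):
        self._find(name).up = False

    def recover(self, name):
        inst = self._find(name)
        inst.up = True
        # Replay every update the instance missed, then let it rejoin.
        for key, value in self.log[inst.applied:]:
            inst.data[key] = value
        inst.applied = len(self.log)


proxy = AvailabilityProxySketch(["a", "b", "c"])
proxy.write("x", 1)
proxy.fail("b")       # reads and writes now go only to "a" and "c"
proxy.write("x", 2)   # "b" misses this update
proxy.recover("b")    # "b" replays the missed update and rejoins the group
```

In this sketch an instance rejoins only after it has applied the whole log, which mirrors the rule that a recovered instance must first receive the updates it missed. A real deployment would also need failure detection and agreement on update order across machines, which the sketch leaves out.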
The proxy operates as a transparent layer between clients and a service, so it is possible to apply this technology to an existing service without a complete rewrite. Internally, the proxy uses the Paxos algorithm: