Code bug blamed for Azure cloud services crash

Apparently, even the best programmers make mistakes. Microsoft this week attributed a crash in Azure cloud services on April 1st to a code bug.

The outage didn’t last long: services such as Dynamics 365 and Xbox Live were generally back online at around 5:30 a.m. EDT, within an hour of the problems starting. Still, the Redmond, Washington-based software giant said the recovery time “exceeded our design goal.”

These and several other services crashed because Azure DNS had a service availability problem. According to Microsoft, customers could not resolve domain names for cloud services.

The root cause was described as follows:

Azure DNS servers experienced an unusual increase in DNS queries from around the world targeting a number of domains hosted on Azure. Typically, Azure’s cache and traffic shaping layers would mitigate this surge.
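
To illustrate the general idea of a cache and a traffic shaping layer absorbing a query surge, here is a minimal Python sketch. It is not Microsoft's implementation: the class names `TokenBucket` and `CachingFrontEnd`, the `backend` callable, and the rate, capacity, and TTL values are all hypothetical, chosen only for illustration.

```python
import time


class TokenBucket:
    """Traffic shaping: admit about `rate` queries per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill tokens for the time elapsed since the last call, up to capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class CachingFrontEnd:
    """Hypothetical front end: answer from cache when possible, shape the rest."""

    def __init__(self, backend, bucket, ttl=30.0, clock=time.monotonic):
        self.backend = backend  # callable: name -> answer (stands in for the DNS servers)
        self.bucket = bucket
        self.ttl = ttl
        self.clock = clock
        self.cache = {}  # name -> (answer, expiry time)

    def resolve(self, name):
        now = self.clock()
        hit = self.cache.get(name)
        if hit and hit[1] > now:
            return hit[0]  # cache hit: the backend sees no load
        if not self.bucket.allow():
            return None  # over the rate limit: shed the query instead of forwarding it
        answer = self.backend(name)
        self.cache[name] = (answer, now + self.ttl)
        return answer
```

In this sketch, repeated queries for the same name are served from the cache, and only cache misses that fit within the token bucket's budget reach the backend. A surge of queries is therefore either answered cheaply or dropped before it can overload the servers behind the front end.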


