On January 29, Microsoft Cloud services including Microsoft Azure, Office 365, and Dynamics 365 suffered a major outage. This resulted in customers experiencing intermittent access to Office 365 and also deleting several database records. This comes just after a major outage that prevented Microsoft 365 users from accessing their emails for an entire day in Europe.
⚠️Engineers are aware of an issue affecting Role-Based Access Control and logins to the Azure Portal and are actively investigating. We will provide more information as it becomes available.
Users who were already logged into Microsoft services weren’t affected; however, those that were trying to log into new sessions were not able to do so.
How did this Microsoft Azure outage happen?
According to Microsoft, the preliminary reason behind this outage was a DNS issue with CenturyLink, an external DNS provider. Microsoft Azure’s status page read, “Engineers identified a DNS issue with an external DNS provider”. CenturyLink, in a statement, mentioned that their DNS services experienced disruption due to a software defect, which affected connectivity to a customer’s cloud resources.
Along with authentication issues, this outage also caused the deletion of users’ live data stored in Transparent Data Encryption (TDE) databases in Microsoft Azure. TDE databases encrypt information dynamically and decrypt them when customers access it. As the data is stored in encrypted form, it prevents intruders from accessing the database.
For encryption, many Azure users store their own encryption keys in Microsoft’s Key Vault encryption key management system. The deletion was triggered by a script that automatically drops TDE database tables when corresponding keys can no longer be accessed in the Key Vault.
Microsoft was able to restore the tables from a five-minute snapshot backup. But, those transactions that customers had processed within five minutes of the table drop were expected to raise a support ticket asking for the database copy.