AI researchers at Microsoft have made a huge mistake.
According to a new report from cloud security company Wiz, the Microsoft AI research team accidentally leaked 38TB of the company's private data.
38 terabytes. That's a lot of data.
The exposed data included full backups of two employees' computers. These backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and more than 30,000 internal Microsoft Teams messages from more than 350 Microsoft employees.
So, how did this happen? The report explains that Microsoft's AI team uploaded a bucket of training data containing open-source code and AI models for image recognition. Users who came across the Github repository were provided with a link from Azure, Microsoft's cloud storage service, in order to download the models.
One problem: The link that was provided by Microsoft's AI team gave visitors complete access to the entire Azure storage account. And not only could visitors view everything in the account, they could upload, overwrite, or delete files as well.
Wiz says that this occurred as a result of an Azure feature called Shared Access Signature (SAS) tokens, which is "a signed URL that grants access to Azure Storage data." The SAS token could have been set up with limitations to what file or files could be accessed. However, this particular link was configured with full access.
Adding to the potential issues, according to Wiz, is that it appears that this data has been exposed since 2020.
Wiz contacted Microsoft earlier this year, on June 22, to warn them about their discovery. Two days later, Microsoft invalidated the SAS token, closing up the issue. Microsoft carried out and completed an investigation into the potential impacts in August.
Microsoft provided TechCrunch with a statement, claiming “no customer data was exposed, and no other internal services were put at risk because of this issue.”