Wednesday, May 29, 2024

Microsoft AI GitHub Repository Breach Exposes 38TB of Private Data

The breach exposed sensitive information like secrets, keys, passwords, and over 30,000 internal Teams messages. 

Microsoft has addressed a security incident that exposed 38 terabytes of private data. According to researchers at the cloud security firm Wiz, the exposure originated in the company’s AI GitHub repository and occurred inadvertently when open-source training data was published.

The exposed data included a backup of two former employees’ workstations, containing secrets, keys, passwords, and over 30,000 internal Teams messages.

The repository, named “robust-models-transfer,” has been made inaccessible. Before its takedown, it housed source code and machine learning models related to a 2020 research paper titled “Do Adversarially Robust ImageNet Models Transfer Better?”

Wiz’s report attributed the exposure to an overly permissive Shared Access Signature (SAS) token. SAS is an Azure feature that grants access to storage data through a signed URL, and tokens issued this way are difficult to track and revoke. Specifically, the repository’s README.md file directed developers to download models from an Azure Storage URL, but the token in that URL granted access to the entire storage account rather than only the intended files, exposing additional private data.
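
To make the idea of an "overly permissive" SAS URL concrete, the following is a minimal sketch that inspects the query string of a SAS URL and flags broad scope, write-capable permissions, and a distant expiry. The function name `audit_sas_url`, the 30-day threshold, and the warning texts are assumptions made for illustration; they are not taken from Wiz’s report or from any Microsoft tool. The sketch relies only on the documented SAS query parameters `srt` (signed resource types, present on account-level tokens), `sp` (signed permissions), and `se` (signed expiry).

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# Permission letters that allow changing data: write, delete, add, create.
MUTATING_PERMISSIONS = set("wdac")


def audit_sas_url(url, now=None, max_days=30):
    """Return a list of warnings for one SAS URL (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    warnings = []

    # Account SAS tokens carry 'srt'; a service SAS scoped to a single
    # blob or container carries 'sr' instead.
    if "srt" in params:
        warnings.append("account-level scope")

    # 'sp' is a string of single-letter permissions, e.g. "rl" or "rwdlac".
    granted = set(params.get("sp", ""))
    if granted & MUTATING_PERMISSIONS:
        warnings.append("grants more than read access")

    expiry = params.get("se")
    if expiry is None:
        warnings.append("no expiry found")
    else:
        # Older Python versions do not accept a trailing 'Z' in fromisoformat.
        expires = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        if (expires - now).days > max_days:
            warnings.append("expires more than %d days out" % max_days)

    return warnings
```

Calling `audit_sas_url` on a URL whose token is limited to one blob, read-only, and short-lived returns an empty list, while an account-level token with write permissions and an expiry decades away produces all three warnings.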

To address this issue, Microsoft promptly revoked the SAS token and blocked external access to the storage account. The company’s investigation found no unauthorized exposure of customer data and confirmed that no other internal services were compromised. 

The company also identified a bug in its scanning system: the system had detected the specific SAS URL in the repository, but the finding was incorrectly marked as a false positive. To strengthen future detection, Microsoft has expanded its secret scanning service to include SAS tokens with overly permissive settings.
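
As a rough illustration of what scanning a repository file for SAS URLs can look like, the sketch below searches text for Azure Storage URLs that carry a `sig=` query parameter. It is not Microsoft’s secret scanning service; the pattern, the function name `find_sas_urls`, and the choice of host suffixes are assumptions made for this example, and a production scanner would need to cover more cases (custom domains, tokens stored apart from the URL, and so on).

```python
import re

# Characters allowed inside a URL for this sketch: anything except
# whitespace, quotes, angle brackets, and closing brackets, so that
# Markdown links such as [text](url) are not over-matched.
_URL_CHAR = r"[^\s\"'<>)\]]"

# An Azure Storage endpoint followed by a query string containing 'sig='.
SAS_URL_PATTERN = re.compile(
    r"https://[a-z0-9]+\.(?:blob|file|queue|table|dfs)\.core\.windows\.net/"
    + _URL_CHAR + r"*\?" + _URL_CHAR + r"*\bsig=" + _URL_CHAR + r"+"
)


def find_sas_urls(text):
    """Return (line_number, url) pairs for every SAS-like URL in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SAS_URL_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Running `find_sas_urls` over a README’s contents yields the line number and URL of each candidate, which a reviewer or a follow-up check on scope and expiry can then triage.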

Sahil Pawar
I am a graduate with a bachelor's degree in statistics, mathematics, and physics. I have been working as a content writer for almost 3 years and have written for a plethora of domains. Besides, I have a vested interest in fashion and music.
