Thursday, September 23, 2021

Google’s DeepMind Open Sources 3D Structures of all Proteins

All the data is freely available for academic and commercial use under the Creative Commons Attribution 4.0 (CC-BY 4.0) license.

Google’s DeepMind has partnered with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI) to publish protein structures predicted by AlphaFold, DeepMind’s AI system that predicts the three-dimensional structure of a protein from its sequence of amino acids.

Proteins are complex molecules that provide structure to cells and organisms. They differ from one another primarily in their sequence of amino acids, and that sequence determines how each protein folds.

Protein folding is the physical process by which a protein chain arranges itself into a specific three-dimensional structure. In 1958, Sir John Kendrew and his co-workers published a low-resolution 3D structure of the protein myoglobin. That work led many scientists to investigate the structures of other proteins.

AlphaFold ranked first in the 14th Critical Assessment of Techniques for Protein Structure Prediction (CASP14), predicting structures with high accuracy. The AlphaFold database and source code are freely available to the scientific community, a contribution that will aid a wide range of advanced biological research.

A deep learning model for structure prediction has considerable scope in the medical field, where it could help researchers understand viral processes and emerging mutations. AlphaFold’s approach combines physical and biological knowledge of protein structure with multiple sequence alignments in the design of its deep learning model.
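To make the idea of a multiple sequence alignment concrete, here is a minimal Python sketch. It is a toy illustration only, not AlphaFold's pipeline: the sequences in `TOY_MSA` are invented, and the helper names `column_profile` and `conserved_columns` are hypothetical. It shows how aligned rows of related sequences can be summarized column by column, which is the kind of evolutionary signal an alignment carries.

```python
from collections import Counter

# Toy alignment: invented sequences, one aligned row per related protein.
# '-' marks a gap introduced by the alignment.
TOY_MSA = [
    "MKTAYIA",
    "MKSAYLA",
    "MRTA-IA",
    "MKTAYIA",
]


def column_profile(msa):
    """Return, for each alignment column, the frequency of each residue.

    Gaps are ignored when computing frequencies.
    """
    if len({len(row) for row in msa}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    profile = []
    for column in zip(*msa):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


def conserved_columns(msa, threshold=1.0):
    """Indices of columns whose most common residue reaches the threshold."""
    return [
        i
        for i, freqs in enumerate(column_profile(msa))
        if freqs and max(freqs.values()) >= threshold
    ]


if __name__ == "__main__":
    print(conserved_columns(TOY_MSA))
```

Columns where every related sequence keeps the same residue are likely to matter for the protein's structure or function, which is one reason alignments are a useful input for structure prediction.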

“This will be one of the most important datasets since the mapping of the Human Genome,” said Ewan Birney, EMBL Deputy Director-General and EMBL-EBI Director. The AlphaFold database launched with predictions for about 20,000 proteins of the human proteome along with the proteomes of 20 other organisms, about 350,000 protein structures in total.

All the data is freely available for academic and commercial use under the terms of the Creative Commons Attribution 4.0 (CC-BY 4.0) license. In the coming months, DeepMind and EMBL-EBI plan to expand the database to cover a large proportion (almost 100 million structures) of all cataloged proteins.
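Because the data is openly licensed, a predicted structure can in principle be downloaded with a few lines of Python. The sketch below rests on assumptions that should be checked against the database's own documentation: the file-naming pattern in `model_url`, the `BASE_URL` value, and the model version number are not stated in this article and may change over time. The function names are hypothetical.

```python
from urllib.request import urlopen

# ASSUMPTION: base location and file-naming pattern of the AlphaFold database.
# Verify both, and the current model version, in the database documentation.
BASE_URL = "https://alphafold.ebi.ac.uk/files"


def model_url(uniprot_accession, version, fragment=1):
    """Build the assumed download URL for one predicted structure (PDB format)."""
    accession = uniprot_accession.strip().upper()
    if not accession.isalnum():
        raise ValueError(f"not a plausible UniProt accession: {uniprot_accession!r}")
    return f"{BASE_URL}/AF-{accession}-F{fragment}-model_v{version}.pdb"


def fetch_model(uniprot_accession, version):
    """Download the structure file as text. Requires network access."""
    with urlopen(model_url(uniprot_accession, version), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # P69905 is the UniProt accession of human hemoglobin subunit alpha.
    print(model_url("P69905", version=4))
```

Anyone reusing downloaded structures should credit the source, as the CC-BY 4.0 license requires attribution.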

Amit Kulkarni
Engineer | Academician | Data Science | Technical Writer Interested in ML Algorithms, Artificial Intelligence, and the implementation of new technology.
