Story at a glance
- Artificial intelligence technology can analyze large amounts of data and look for patterns.
- Scientists are using machine learning, a type of artificial intelligence, to look for a ‘language’ for proteins.
- This research could help researchers understand diseases like cancer, Alzheimer’s and other neurodegenerative diseases.
Artificial intelligence is being used to try to crack open all kinds of problems. Some experts think that techniques used to predict what types of movies or TV shows someone will like or what word will come next in a sentence could be applied to biology. A group of researchers is hoping to use algorithms and language processing to find the mistakes in cells that cause diseases such as cancer, Alzheimer’s disease and other neurodegenerative disorders.
A team based at St John's College, University of Cambridge, thinks that machine learning technology can be used to find a kind of “biological language” for disease in the body. “Bringing machine-learning technology into research into neurodegenerative diseases and cancer is an absolute game-changer,” says Tuomas Knowles, one of the authors of the paper and a Fellow at St John's College, in a press release. “Ultimately, the aim will be to use artificial intelligence to develop targeted drugs to dramatically ease symptoms or to prevent dementia happening at all.”
In a paper published in PNAS, the researchers detail their approach. “The human body is home to thousands and thousands of proteins and scientists don't yet know the function of many of them. We asked a neural network based language model to learn the language of proteins,” says Kadi Liis Saar, who is the first author of the paper and a research fellow at St John's College, in the press release. The researchers are focused on protein condensates, in which several proteins merge, and on how these condensates can affect gene expression and may be related to how a cell becomes diseased.
The goal was to see whether the neural network could pick out patterns in protein sequences that would help the researchers understand the proteins, and whether those sequences follow a structure similar to the grammar of a language. “We fed the algorithm all of [the] data held on the known proteins so it could learn and predict the language of proteins in the same way these models learn about human language and how WhatsApp knows how to suggest words for you to use,” says Saar. “Then we were able [to] ask it about the specific grammar that leads only some proteins to form condensates inside cells.”
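To give a rough feel for what predicting the "next word" in a protein might look like, here is a minimal sketch in Python. It is not the Cambridge team's neural network. It is a much simpler stand-in that only counts which amino acid letter tends to follow which, and the sequences and function names (`TOY_SEQUENCES`, `train_bigram`, `predict_next`) are made up for illustration.

```python
from collections import Counter, defaultdict

# Toy sequences written in the one-letter amino acid alphabet.
# These strings are invented for illustration; they are not real proteins.
TOY_SEQUENCES = ["GSGSGR", "GSGRGS", "MKVLAA"]


def train_bigram(sequences):
    """Count how often each residue is followed by each other residue."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair every residue with the one that comes right after it.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, residue):
    """Return the residue seen most often after `residue`, or None if unseen."""
    if residue not in counts:
        return None
    return counts[residue].most_common(1)[0][0]


model = train_bigram(TOY_SEQUENCES)
print(predict_next(model, "G"))  # most frequent follower of glycine in the toy data
```

A phone keyboard that suggests the next word works on a similar principle, only with words instead of amino acids and with a far more capable model than simple counting.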
The benefit of using artificial intelligence is that it can process large amounts of data and find patterns that are hard for a person to discern. If scientists can use the results of that analysis to work out the language of proteins, that could help them figure out what is going wrong in diseased cells based on how proteins are expressed in them.
Other researchers are invited to submit protein sequences to be processed by the network that the team developed. In the future, this kind of analysis could help researchers who are studying cancer and other diseases. Saar adds, “It is a very challenging problem and unlocking it will help us learn the rules of the language of disease.”