Quantori blog

November 28, 2023

How pre-trained language models can transform Life Sciences

Omar Kantidze
Principal Scientist
Language models, pre-trained on both natural languages and molecular languages (DNA and proteins), are transforming the Life Sciences.

It's fair to say that generative artificial intelligence (AI) has defined the year 2023. Users are experimenting with ChatGPT and starting to use it to solve day-to-day problems, while businesses are trying to understand how generative AI can increase their efficiency. Although GPT is the most accessible and best-known model for a wide range of users, it is only the tip of the iceberg. Dozens of pre-trained language models (PLMs) have emerged over the last five years to solve various tasks.

Without a doubt, pre-trained language models can change the game in the Life Sciences and Healthcare by automating many processes. From bringing GPT-based assistants into patient care to creating FDA-regulated electronic case reports from protocols, their applications are vast. Yet the challenge lies in identifying the specific use cases where these models can provide groundbreaking insights.

From this point of view, let’s take a closer look at the progress in biomedical language models trained on extensive volumes of medical and scientific texts. These specialized PLMs are becoming more skilled in various tasks, such as extracting clinical data, answering medical questions, predicting drug interactions, and so on. It is very likely that biomedical PLMs will soon become essential tools for researchers, clinicians, and professionals in drug discovery. 

There is also an exciting opportunity to train language models using molecular languages, such as DNA and protein sequences, instead of natural (human) languages. There are already successes in this area: several PLMs trained with DNA and protein sequences can perform tasks like predicting protein structure and properties, identifying DNA regulatory elements, defining functional genetic variants, and more. 

Currently, PLMs are revolutionizing the field of protein drug engineering as well. Specialized generative language models can create sequences of proteins that are not found in nature yet possess the requested properties. Investors are confident in startups that use AI to design new proteins. This confidence might trigger a rapid expansion of these technologies, facilitating the active integration of their products into clinical applications.  

To learn about the current landscape of biomedical, genomic, and protein language models, how they have evolved, and the range of tasks they can address, check out Quantori’s new white paper, “Beyond words: the expanding role of language models in biology” by Dr. Omar Kantidze.

