Doug Bonderud

May 9th 2023

Human Protein Design: AI Goes Toe-to-Toe With Evolution


Artificial intelligence (AI) in health care is moving into the mainstream. As noted by PwC, by leveraging the learning potential of AI, new technologies are now capable of early-stage disease detection and in-depth diagnostics and can even help healthcare professionals make better, more informed decisions.

But this is just the beginning. New efforts in AI have produced tools capable of designing proteins that match, or even exceed, the performance of those produced by evolution.

Getting Up to Speed: The Basics of AI Protein Design

Tapping tech to help with protein design is an established practice. For example, online protein folding game Foldit leveraged the power of collective human intellect to explore enzyme structures. In 2011, scientists gave Foldit players a challenge: Find the structure of a specific enzyme involved in HIV reproduction. In three weeks, online players cracked the code, solving a problem that researchers had been trying to resolve for years.

New human protein programs, meanwhile, are making use of next-generation AI to both replicate structures found in nature and think outside the biological box. As noted by ScienceDaily, getting AI up to speed started with researchers from the University of California, San Francisco, feeding the amino acid sequences of 280 million proteins into the machine learning (ML) algorithm that underpins the AI tool. Then, they gave it some time to think about what it learned and to start understanding the patterns and connections that make these proteins work.

Next, teams supplied the AI with 56,000 lysozyme family sequences along with some context around how these sequences are formed. Armed with this information, the tool generated one million potential protein sequences. One hundred were chosen to test, and of those, five were made into artificial proteins and used in cells.

The results were impressive. Two of the proteins produced were able to break down bacterial cell walls about as effectively as hen egg white lysozyme (HEWL), an enzyme found in nature. Interestingly, while these two proteins delivered similar performance, only 18% of their sequences overlapped. What’s more, scientists found that AI-generated options still offered some efficacy even when just 31.4% of their sequence matched any known protein.
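The overlap figures above are easiest to picture as percent identity: line two proteins up position by position and count how often the residues agree. The sketch below is a minimal illustration under that assumption; the article does not say exactly how the figures were computed. The function name `percent_identity` and the example strings are hypothetical, and the inputs are assumed to be already aligned (same length, with `-` marking gaps).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions where both sequences share a residue.

    Assumes the two sequences are already aligned to the same length,
    with '-' used as the gap character.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")

    # Skip columns where both sequences have a gap; they carry no information.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if not (a == "-" and b == "-")]
    if not columns:
        return 0.0

    # A column counts as a match only if both sides hold the same real residue.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Made-up four-residue example: three of four positions agree.
print(percent_identity("ACDE", "ACDF"))  # 75.0
```

By this measure, two proteins at 18% agree at fewer than one position in five, yet in the study both still did the same job.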

From Prediction to Production

Predicting the form of existing proteins based on pattern recognition is one thing, but what about creating entirely new proteins? According to the American Association for the Advancement of Science, there are two broad approaches to this goal: inpainting and constrained hallucination.

Inpainting uses AI solutions to fill in the blanks around a central feature. For example, an AI tool might be given part of a protein that binds well to specific antibodies. Equipped with only this information and its knowledge of protein patterns, AI can pinpoint missing pieces and slowly build out the structure around that central feature.
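To make the fill-in-the-blanks idea concrete, here is a deliberately tiny sketch. It is not how real inpainting tools work: they rely on trained neural networks, while this toy simply picks the most common residue at each blank position from a few example sequences. Everything in it (`KNOWN_SEQUENCES`, the `inpaint` function, the `_` blank marker) is a made-up stand-in for illustration.

```python
from collections import Counter

# Hypothetical stand-in for a trained model's "knowledge of protein patterns":
# a few made-up sequences of equal length.
KNOWN_SEQUENCES = ["MKTAYIAK", "MKSAYLAK", "MRTAYIAR"]


def inpaint(template: str, known=KNOWN_SEQUENCES, blank: str = "_") -> str:
    """Fill each blank with the residue most often seen at that position.

    Residues already present in the template (the central feature) are kept as-is.
    """
    filled = []
    for i, residue in enumerate(template):
        if residue != blank:
            filled.append(residue)  # fixed part of the design: leave untouched
            continue
        counts = Counter(seq[i] for seq in known if i < len(seq))
        # Fall back to alanine if no example covers this position.
        filled.append(counts.most_common(1)[0][0] if counts else "A")
    return "".join(filled)


# The motif "TAY" is the known feature; the blanks around it get filled in.
print(inpaint("__TAY___"))  # MKTAYIAK
```

The known feature stays fixed while everything around it is inferred from patterns seen elsewhere.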

Constrained hallucination, meanwhile, lets AI run wild. Instead of providing a central protein feature, tools are simply given a goal, such as binding to carbon dioxide. AI solutions then generate novel proteins based on their understanding of component parts and interactions. Once a complete, virtual protein is imagined, the tool evaluates the potential efficacy of the outcome. It then keeps what works and mutates what doesn’t, getting closer to the goal each time.
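The keep-what-works, mutate-what-doesn't loop described above can be sketched as a simple mutate-and-select search. This is a toy under loud assumptions: `toy_score` is a hypothetical fitness function that just counts matches against a made-up target string, where a real tool would score a predicted structure against the design goal. The names `hallucinate` and `toy_score`, the step count, and the target are all illustrative.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter residue codes


def toy_score(seq: str, target: str) -> int:
    """Hypothetical fitness: how many positions match a made-up target."""
    return sum(a == b for a, b in zip(seq, target))


def hallucinate(target: str, steps: int = 2000, seed: int = 0) -> str:
    """Start from a random sequence, then repeatedly mutate one position
    and keep the change only if the score does not get worse."""
    rng = random.Random(seed)
    current = "".join(rng.choice(AMINO_ACIDS) for _ in target)
    best = toy_score(current, target)

    for _ in range(steps):
        pos = rng.randrange(len(current))
        candidate = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        score = toy_score(candidate, target)
        if score >= best:  # keep what works, discard what doesn't
            current, best = candidate, score
    return current


goal = "MKTAYIAK"  # made-up goal standing in for "binds carbon dioxide"
design = hallucinate(goal)
print(design, toy_score(design, goal))
```

Because a mutation is only accepted when the score holds steady or improves, the design can never drift further from the goal, which is the sense in which each round gets closer.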

In other words, this is evolution on AI steroids. Instead of waiting for the physical laboratory of natural environments to prove the efficacy of these proteins, AI tools can check millions of potential designs in a matter of days or weeks.

That said, this isn’t a science silver bullet. While AI-generated proteins offer massive potential, they don’t exist in isolation — instead, they’re part of larger biological systems that may trigger unanticipated reactions that change the way proteins work.

What’s Next for Human Protein Design?

If everything goes according to plan, the next step is using AI in health care to create new medical treatments for humans — or even using AI in vaccine development.

According to HealthITAnalytics, for example, work from Harvard and the University of Washington School of Medicine using both inpainting and hallucination techniques showed promising results. In one case, tools created novel proteins capable of binding to PD-1, a receptor targeted in cancer treatment. In another, proteins showed potential as the basis of vaccines for respiratory syncytial virus (RSV), which can be deadly to vulnerable groups.

Don’t expect designer protein drugs to appear anytime soon, however. While AI can hallucinate, create and test proteins quickly, scientists can’t sidestep the rigor required to evaluate these potential pharmaceuticals for risk. In practice, this means more lab testing, controlled testing in animal models and limited human testing before any new treatments are available.

Despite the potential distance between the design and delivery of AI-produced proteins, the proof-of-concept makes a powerful statement. Given the data and the time, AI is capable of not just mimicking but in some cases outdoing evolution to build something we’ve never seen.
