We’re inside the headquarters of Outpace Bio in Seattle. Senior scientist Bobby Langan shows a video of one of his favorite deep-learning tools, used to create experimental cancer therapeutics.
A fuzzy blob coalesces into a molecular model of a protein, its pastel atoms wrapping precisely around its target, a regulator of the immune response.
The software operates much like the AI tool DALL-E, known for generating images through verbal prompts. But instead of psychedelic cats or lurid landscapes, the tool at Outpace dreams up designs of proteins, one of life’s key building blocks. “It starts from scratch and generates an image,” said Langan of the software, called RF Diffusion.
RF Diffusion is part of a fast-growing set of AI tools transforming how scientists forge proteins into drugs, industrial enzymes, biosensors, food products and more.
Seattle companies, many of them spinouts of the University of Washington’s Institute for Protein Design, are at the forefront.
“Ten to 15 years ago we were kind of out in the lunatic fringe,” said IPD head David Baker. “It’s interesting to be at center stage now. There’s so much that can be done,” added Baker, whose institute has fostered more than a dozen spinouts and affiliated companies pushing the technology forward.
The field of protein design is still young. Last year, a COVID-19 vaccine with origins at the IPD was the first vaccine or therapy based on computational design to gain regulatory approval, in South Korea.
But the potential market is vast, Cyrus Biotechnology CEO Lucas Nivon told GeekWire. Proteins have the ability to fold into shapes that can precisely target biological molecules. Last year, biologics like protein-based therapeutics accounted for a third of drug approvals.
“In ten years, most biologics are going to be partly or completely designed by algorithmic methods. We are just at the beginning of that transformation,” said Nivon in a recent talk surveying the field.
The same rapid AI advances powering generative AI tools like OpenAI’s DALL-E and ChatGPT are also driving changes in how scientists engineer proteins.
Some of the new protein design tools are even “large language models” like ChatGPT, which creates text from prompts. After all, proteins are built from an alphabet of 20 amino acids.
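To make the alphabet comparison concrete, here is a minimal Python sketch of how a protein sequence can be treated as text. The 20 one-letter amino acid codes are standard, but the token numbering and the function names are assumptions made for illustration and do not describe how any particular model works.

```python
# The 20 standard amino acids, written with their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical vocabulary: each letter gets an integer token by position.
TOKEN_OF = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence: str) -> list[int]:
    """Turn a protein sequence into a list of integer tokens."""
    return [TOKEN_OF[aa] for aa in sequence.upper()]


def decode(tokens: list[int]) -> str:
    """Turn integer tokens back into a protein sequence."""
    return "".join(AMINO_ACIDS[t] for t in tokens)


def sequence_space(length: int) -> int:
    """Count the distinct sequences of a given length (20 ** length)."""
    return len(AMINO_ACIDS) ** length


tokens = encode("MKV")
print(tokens, decode(tokens), sequence_space(2))
```

The count grows by a factor of 20 with every added position, which is one reason exhaustive search is impractical and generative models are attractive.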
The expanding field
The field of protein design exploded in 2020 when Alphabet’s DeepMind showcased a deep learning tool that predicts how proteins fold by using their amino acid sequences as input.
The folding process depends on multiple molecular interactions within the protein and its environment, which shift as a protein bends into loops, pockets, active sites, and a host of other parts.
DeepMind’s AlphaFold tool and similar software developed by the IPD won the “Breakthrough of the Year” award from Science magazine in 2021. The tools stunned scientists with their speed and accuracy.
AlphaFold is “game-changing,” said Sue Biggins, head of the Basic Sciences Division at Fred Hutchinson Cancer Center, where the software is widely used.
A host of proprietary and open-access AI tools now go even further, helping scientists to generate new proteins from scratch or improve existing ones.
“The tools we’ve been developing are so new that most companies haven’t really adapted to using them yet,” said Baker.
Outpace customizes IPD’s open access tools like RF Diffusion with software scripts, prompts and training data specific to its protein engineering goals.
Last year, Outpace joined Cyrus, Arzeda and others as founding members of OpenFold, a consortium that shares and improves on protein design software, with the intent of speeding up innovation for everyone. Some companies also use proprietary tools.
The innovations are driving investment. Just last year, industrial protein design company Arzeda landed $33 million, biosensor company Monod Bio raised $25 million and Vilya launched with $50 million. These and other IPD spinouts have raised more than $1 billion in total and forged partnerships with biopharma companies like Bristol Myers Squibb and Gilead Sciences.
Last year, Vancouver, Wash.-based Absci inked a partnership with Merck worth up to $610 million, and Seattle-area Just – Evotec Biologics recently landed up to $75 million from the U.S. Department of Defense to develop antibody-based therapeutics. Other protein design startups include DeepMind spinout Isomorphic Labs, Cradle, Generate Biomedicines and Profluent.
These companies are part of a larger biopharma ecosystem leveraging AI for a wide range of uses. AI drug discovery and design startups have raised $10 billion since 2019, according to CBInsights, though the pace slowed last year as biotech funding cooled.
Examples of broader uses include mining databases for new drug targets, designing small molecule drugs, and developing RNA-based treatments, the focus of Seattle startup Shape Therapeutics.
Outpace spun out of Seattle and South San Francisco cell therapy company Lyell Immunopharma, which uses engineered proteins to reprogram immune cells to attack tumors. The Outpace team helped make designs now being tested by Lyell in clinical trials for breast, lung and other solid tumors.
Outpace aims to make proteins that “rewire how cells are making decisions and how they’re instructing the cells around them,” said CEO and co-founder Marc Lajoie, a former IPD postdoc.
Using the toolkit
At Outpace, Langan and his colleagues turn to another IPD tool to flesh out the backbone images generated by RF Diffusion. This second deep-learning tool, ProteinMPNN, fills in amino acid sequences onto the protein scaffolds.
But just as DALL-E can “hallucinate” an extra finger or nose, AI tools for protein engineering can make up designs that won’t properly fold up or function. “The problem is, you don’t actually know whether the thing that’s up on the screen is even realistic,” said Lajoie.
With ChatGPT, users can readily check the output for facts. Checking the output of protein design tools requires a lot more effort.
Outpace scientists first test their designs computationally, asking AlphaFold or a similar tool from OpenFold to predict if the designs will fold into the desired structure. The proteins that make the cut are tested in the lab.
Outpace’s lab hums with gleaming devices that synthesize proteins and assess whether they bind the correct target, activate certain cellular signals, and otherwise behave appropriately. A scientist operates a machine spitting out a graph with a red peak indicating that a protein is stable, and not breaking down into smaller bits.
The scientists feed such data from the lab back into their models. With each iteration of computational design and assessment, they get closer to making proteins that fold and act how they want.
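The computational half of that loop can be sketched in a few lines of Python. This is only an illustration of the generate-then-filter pattern: `propose_sequence` and `predict_fold_score` are hypothetical stand-ins that return random values, not calls to RF Diffusion, ProteinMPNN, AlphaFold or OpenFold, and the 0.9 threshold is an arbitrary assumption. Feeding lab results back into the models is not modeled here.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


@dataclass
class Design:
    sequence: str
    predicted_score: float  # stand-in for a fold-prediction confidence, 0 to 1


def propose_sequence(length: int, rng: random.Random) -> str:
    # Placeholder for backbone generation plus sequence design.
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def predict_fold_score(sequence: str, rng: random.Random) -> float:
    # Placeholder for a structure-prediction check; the score is made up.
    return rng.random()


def design_round(n_candidates: int, length: int, threshold: float,
                 rng: random.Random) -> list[Design]:
    """Generate candidates and keep those that pass the computational filter."""
    designs = []
    for _ in range(n_candidates):
        seq = propose_sequence(length, rng)
        designs.append(Design(seq, predict_fold_score(seq, rng)))
    # Only designs at or above the threshold would move on to lab testing.
    return [d for d in designs if d.predicted_score >= threshold]


rng = random.Random(0)
shortlist = design_round(n_candidates=100, length=50, threshold=0.9, rng=rng)
print(f"{len(shortlist)} of 100 candidates pass the computational filter")
```

In a real pipeline the filter step is what keeps lab work manageable, since only the shortlist is synthesized and tested.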
The process can shave years off the timeline for making new therapeutic candidates, said Lajoie. Many other companies have a similar iterative workflow.
Only a few years ago, scientists would produce reams of code for their protein designs — newer tools require only short prompts, said Lajoie.
More than hype
While some companies focus on making entirely new proteins from scratch, others like Cyrus aim to improve on existing protein therapeutics.
Cyrus’ projects include improving Fabrazyme, a protein-based drug for a rare condition called Fabry disease. The drug is often neutralized by the immune system in the patients who use it.
AI has still not overtaken older “statistical physical” tools for some uses, such as predicting the effects of swapping out one amino acid for another, said Nivon. In addition to AI, Cyrus uses an older IPD tool, Rosetta, which incorporates physical rules and statistics about amino acid interactions.
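For a sense of what "swapping out one amino acid for another" involves, the Python sketch below lists every single-residue substitution of a short sequence. It only enumerates the variants that a scoring tool would then evaluate. It does not reproduce Rosetta or any Cyrus method, and the function name and the example sequence are assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence: str):
    """Yield (label, mutant) for every one-residue swap, labeled like 'M1A'."""
    for pos, original in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                # Label: original residue, 1-based position, new residue.
                label = f"{original}{pos + 1}{replacement}"
                mutant = sequence[:pos] + replacement + sequence[pos + 1:]
                yield label, mutant


variants = dict(single_substitutions("MKV"))
print(len(variants), variants["M1A"])
```

Each position has 19 alternatives, so a protein of a few hundred residues already has thousands of single-swap variants to score.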
Ramy Farid, the CEO of software and drug company Schrödinger, recently cautioned against overhyping the potential of machine learning. At the same time, Farid said that computational methods stand to sharply reduce the time needed to generate new drug candidates.
Arzeda co-founder and CEO Alexandre Zanghellini said recent protein design advances are “profoundly significant.” But companies also need biochemical expertise as well as experimental validation and commercial-scale manufacturing capabilities, he said.
“Instead of overhype, I see perhaps excessive confidence and naivety from newcomers in the field that think that simply having the latest deep-learning model will allow them to successfully design and commercialize proteins,” said Zanghellini.
In the future, the tools for protein design will only get better, said Baker. Active areas of research include improving the design of catalysts to speed up biochemical reactions, such as chewing up other proteins for therapeutic or industrial use, he said. Said Baker: “The sky’s the limit right now.”
Wide range of approaches and products
We compiled the list below to spotlight how Washington state companies leverage protein design. For a more comprehensive list of companies, check out this story on IPD spinouts.
Arzeda
What they do: Arzeda creates proteins for industrial purposes, such as enzymes that convert agricultural material like stevia leaves into natural sweeteners, or vegetable oil into animal-fat replacements for alternative meat. In one project, Arzeda partners with a larger company to improve detergents, using stain-removing enzymes that work at low temperatures. The company’s tech suite includes proprietary large language models iteratively trained on in-house data to improve protein stability, activity and manufacturability. Arzeda’s capabilities span from generative AI to commercial manufacturing.
CEO quote: “We examine more combinations of amino acids than there are drops of water in the Pacific Ocean to create in-silico proteins and enzymes with specific functional properties,” said co-founder and CEO Alexandre Zanghellini. “I hypothesize that two to three decades from now, most of the products around us will be incorporating new proteins designed by computers, or they will have been made with computationally designed proteins.”
More: Enzyme design startup lands $33M
Monod Bio
What they do: Monod Bio develops biosensors that light up when they bind targets such as the COVID-19 virus or molecules associated with cancer. Upon binding, the sensor proteins switch conformation and create a luminescent signal that can be read with a simple device. The IPD spinout uses internally designed algorithms and IPD software to design its sensors, meant for use as research tools or clinical diagnostics.
CEO quote: “Now we can design proteins and protein assemblies with new geometries and new protein-protein binding interfaces with unprecedented accuracy and speed. We believe the next frontier is designing proteins that bind to other biomolecules, such as small molecules, with the same precision and speed,” said Monod CEO and co-founder Daniel Adriano-Silva.
More: Institute for Protein Design spinout Monod Bio raises $25M for molecular biosensors
A-Alpha Bio
What they do: A-Alpha Bio combines machine learning with a proprietary experimental platform that can measure millions of protein-protein interactions, such as between an antibody and its target on a virus.
CEO quote: “Our primary engineering strategy is to generate a lot of experimental data on how proteins behave (specifically, how they bind to each other), and then train machine learning models to predict new sequences that will behave even better,” said David Younger, CEO and co-founder of the IPD spinout. Younger said that the quality and quantity of the data used to train models is key to future advances. “With enough data, we expect that fully in-silico protein engineering will be possible.”
More: University of Washington spinout A-Alpha Bio snags $20M for protein-discovery platform
Absci
What they do: Absci uses proprietary software combined with lab-based validation techniques to design and test its proteins. The company recently showcased a new approach to developing therapeutic antibodies.
CEO quote: Biologics like antibodies have “unmatched ability to precisely target disease pathways, offering transformative therapies for conditions considered undruggable,” said Sean McClain, CEO and founder of the Vancouver, Wash.-based company.
“As generative AI improves structural and functional predictions, and as computational power continues to grow, we will get better at generalizing protein design principles,” said McClain. As a result, researchers will be able to reduce animal and human testing and generate more exact and personalized precision therapeutics, he said.
Added McClain: “If the unparalleled speed of uptake for ChatGPT is any indication, personalized medicine is going to happen a lot faster than people think.”
More: Biotech company Absci, which recently went public, cuts ribbon on new HQ in Vancouver, Wash.