From the journal Science, October 30:
A human cell swarms with trillions of molecules, including some 42 million proteins and a plethora of carbohydrates, lipids, and nucleic acids. Crowded with organelles and other structures, the cell boasts an intricate organization that makes baroque architecture seem plain. Its cytoplasm is a frenzied chemical lab, with molecules continuously reacting, rearranging, and reshaping. In the nucleus, thousands of genes are constantly switching on and off to turn the seeming chaos into concerted actions that help the cell survive and reproduce.
This complexity is more than the human mind can yet fully understand or predict. But many researchers think artificial intelligence (AI), with its prodigious ability to assimilate and process information, might be up to the task. More than 2 decades ago researchers started to build systems of equations meant to simulate some of the cell’s workings. Now, they have progressed to AI-driven replicas that, like the large language models taking business and popular culture by storm, ingest vast amounts of data to learn on their own. ChatGPT’s attention-grabbing debut nearly 3 years ago inspired the virtual cell builders. “People want this kind of moment for biology,” says Kasia Kedzierska, an AI research scientist at the Allen Institute.
How soon it is coming depends on whom you ask. Virtual cells that emulate their living counterparts would be a boon for many areas of research. In pharma labs, scientists could use them to quickly evaluate large numbers of potential drugs without the expense and difficulty of experiments. They might serve as test beds for engineering cells to perform novel functions. Virtual cells customized to match a patient’s molecular profile could help doctors choose tailored medications. Researchers might even weave cell models into virtual tissues and organs to tackle questions such as how a tumor’s environment affects its growth.
Such models could also help researchers make sense of the vast amount of diverse information pouring into molecular databases, says Theofanis Karaletsos, head of AI for science at the Chan Zuckerberg Initiative (CZI). An AI-powered cell mimic, Karaletsos says, “creates an integrated map of knowledge.”
Like ChatGPT and its ilk, AI cell models have spawned big promises and hefty expectations. “Whenever a new model appears, it’s always the best,” says computational biologist Hani Goodarzi of the Arc Institute, who develops such models himself. In June he and more than 20 other researchers launched the Virtual Cell Challenge, a new contest that will put the models to the test annually. Much like a structural biology competition that started in 1994 and helped researchers largely solve the problem of how proteins fold, the Virtual Cell Challenge is meant to spur improvement in a very complex task. For its debut, it is asking AI aficionados to predict the effects of silencing certain genes in human embryonic stem cells.
So far, more than 1000 teams—with names like Cellamander, Zebulon Chow, SmartCell, and Mean Predictors—have entered and are vying for prizes donated by sponsors including Nvidia, the giant tech company that makes the graphics processing units (GPUs) at the heart of many AIs. On 6 December, contest organizers will reveal the final standings, with the top team taking home $100,000 in cash and GPU time. “We want to learn what works and what doesn’t work,” Goodarzi says.
Even if the models perform well on the test, some scientists expect a long road before they can deliver compelling new science or help biologists. “Despite the hype, [the models] are underperforming,” says Alex Lu of Microsoft Research, who studies how AI can identify patterns in biology data. Some seem to have no more predictive power than simpler simulations. The profusion of models is itself a bad sign, says computational biologist Qin Ma of Ohio State University. “One model should be powerful enough that we wouldn’t need to see so many of them.”
The creators of the models say success is only a matter of time. “We haven’t solved the problem yet, but [the approach] is very promising,” says Bo Wang, head of biomedical AI for Xaira Therapeutics, a company that hopes to harness the technology for drug discovery.
Inspired by advances in computing capabilities, researchers began trying to create virtual cells about 25 years ago. They first used computational methods that rely on large sets of equations to recapitulate metabolism, protein synthesis, DNA duplication, and other cell processes.
In 2012, Jonathan Karr, now a computational systems biologist at the Mount Sinai School of Medicine, and colleagues in Markus Covert’s lab at Stanford University unveiled the first whole-cell model, a silicon version of Mycoplasma genitalium. They chose the microbe because it had the smallest genome of any bacterium known at the time: just over 500 genes, compared with more than 4000 in the familiar Escherichia coli. To replicate the organism’s metabolism, the model calculated the concentrations of more than 700 metabolites as they churn through 1100 chemical reactions. With representations of a chromosome and protein-synthesizing organelles known as ribosomes, the ersatz cell reflected some of the internal structure of its real-life counterpart....
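To give a feel for what "calculating metabolite concentrations" through a set of reactions can look like, here is a minimal sketch. It is not the method of the 2012 whole-cell model: the three metabolites (A, B, C), the two reactions, the rate constants, and the `simulate` function are all hypothetical, and the simple fixed-step (Euler) integration is an assumption chosen for brevity. A real model would track hundreds of metabolites and use far more careful numerics.

```python
# Toy illustration only: a hypothetical chain A -> B -> C with
# mass-action kinetics, integrated with a fixed-step Euler scheme.
# None of the names or numbers come from the article.

def simulate(k1=0.5, k2=0.2, a0=1.0, dt=0.01, steps=1000):
    """Return the concentrations (a, b, c) after `steps` time steps."""
    a, b, c = a0, 0.0, 0.0
    for _ in range(steps):
        r1 = k1 * a          # rate of reaction A -> B
        r2 = k2 * b          # rate of reaction B -> C
        a += -r1 * dt        # A is consumed by reaction 1
        b += (r1 - r2) * dt  # B is produced by 1, consumed by 2
        c += r2 * dt         # C is produced by reaction 2
    return a, b, c


if __name__ == "__main__":
    a, b, c = simulate()
    print(f"A={a:.4f}  B={b:.4f}  C={c:.4f}  total={a + b + c:.4f}")
```

Each reaction adds one rate term and each metabolite adds one update line, so the bookkeeping grows quickly as a model approaches the scale described above. In this toy version the total amount of material stays constant, which is a handy sanity check on the arithmetic.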