AI could uncover new physics faster, but there’s a surprising catch

Artificial intelligence is already playing a major role in helping cosmologists study the universe. Now, new research suggests a machine learning technique called transfer learning could make the search for new physics much faster and less expensive. However, the study also uncovered a surprising downside: AI can sometimes become so dependent on what it has already learned that it struggles to recognize something truly new.

The study, published in the Journal of Cosmology and Astroparticle Physics (JCAP), examined how transfer learning might help researchers investigate theories that go beyond the standard cosmological model.

AI and the Search for New Physics

The current standard model of cosmology, known as ΛCDM, successfully explains many large-scale features of the universe, including its expansion and the distribution of galaxies. Yet scientists believe the model is not the final answer.

Recent observations have raised questions that could point toward new physics, including the effects of massive neutrinos, modified gravity, and evolving dark energy. Exploring these possibilities requires researchers to generate enormous numbers of detailed computer simulations, each representing a virtual universe built using different physical assumptions.

Producing these simulations is computationally expensive.

Using Transfer Learning to Reduce Simulation Costs

The researchers investigated whether transfer learning could make this process more efficient.

Transfer learning allows an AI system to apply knowledge gained from one task to another related task. Instead of training a neural network entirely on the most complex and computationally costly simulations, the team first trained it on simpler simulations based on ΛCDM. This initial phase, known as pretraining, was then followed by additional training using more sophisticated models that include potential new physics.

“It’s basically a shortcut,” explains Adrian Bayer, a cosmologist at the Flatiron Institute and Princeton University and a co-author of the study. “Usually people train the AI directly on the most computationally expensive simulations. What we do instead is first use simpler and less expensive ΛCDM simulations to give the AI an idea of what’s happening, and only afterward move to the more complex models.”

Bayer compares the approach to learning from textbooks.

“You first read a basic book to get an idea of the knowledge,” says Bayer, “and then move to the really complicated book.”

According to first author Veena Krishnaraj, an undergraduate student at Princeton University, this strategy spares the AI from having to “digest everything at once.”

The results were striking. In some cases, transfer learning reduced the number of expensive simulations required by more than a factor of ten.
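
For readers who want to see the pretrain-then-fine-tune recipe in concrete form, here is a minimal sketch in Python. It is not the authors' code or network: it assumes a toy linear model trained by gradient descent, and the names `make_data`, `train`, `w_base`, and `w_new`, along with all sample sizes, are hypothetical stand-ins for cheap ΛCDM-like simulations and a small set of expensive ones.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(weights, n_samples, noise=0.01):
    """Draw a toy 'simulation suite': inputs X and targets y = X @ weights + noise."""
    X = rng.normal(size=(n_samples, weights.size))
    y = X @ weights + noise * rng.normal(size=n_samples)
    return X, y

def train(X, y, w_init, steps, lr=0.05):
    """Plain gradient descent on mean squared error, starting from w_init."""
    w = w_init.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def mse(w, X, y):
    """Mean squared prediction error of weights w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

# Assumption: the "new physics" task is a small shift away from the base task.
w_base = np.array([1.0, -2.0, 3.0, -1.0, 2.0])
w_new = w_base + 0.2

X_cheap, y_cheap = make_data(w_base, 500)   # many inexpensive base simulations
X_costly, y_costly = make_data(w_new, 8)    # only a few expensive simulations
X_test, y_test = make_data(w_new, 1000)     # held-out data for the new task

zeros = np.zeros(w_base.size)

# Pretraining: learn the base task from the cheap simulations.
w_pre = train(X_cheap, y_cheap, zeros, steps=200)

# Fine-tuning: a short training run on the few expensive simulations,
# starting from the pretrained weights.
w_finetuned = train(X_costly, y_costly, w_pre, steps=10)

# Baseline: the same short run on the same expensive data, but from scratch.
w_scratch = train(X_costly, y_costly, zeros, steps=10)

mse_finetuned = mse(w_finetuned, X_test, y_test)
mse_scratch = mse(w_scratch, X_test, y_test)
print(f"fine-tuned test MSE:   {mse_finetuned:.4f}")
print(f"from-scratch test MSE: {mse_scratch:.4f}")
```

In this toy setup the fine-tuned model starts close to the right answer, so the same small budget of expensive data takes it much further than training from scratch. The size of that gap here says nothing about the savings reported in the study.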

When Prior Knowledge Becomes a Problem

The study also revealed a less obvious challenge known as negative transfer.

Using Bayer’s textbook comparison, imagine learning medicine from an introductory text and then encountering a rare disease that closely resembles a common condition. Existing knowledge is usually helpful, but it can sometimes encourage the wrong conclusion.

The same issue can arise in AI systems.

In some cases, the signatures of new physics resemble patterns that the AI has already associated with the standard cosmological model. When that happens, the pretrained network may interpret unfamiliar information through the lens of what it already knows, making it harder to recognize genuinely new effects.

The researchers saw this effect while studying simulations that included massive neutrinos. Some of the observational signatures linked to neutrino mass closely resemble changes associated with an existing ΛCDM parameter called σ8, which measures how strongly matter clusters throughout the universe.

Because of this similarity, the pretrained neural network initially had difficulty telling the two effects apart.

“The negative transfer is not random. It is driven by underlying physical degeneracies in the model,” says Krishnaraj.

In other words, different physical processes can produce very similar observable signatures, making it challenging for the AI to correctly identify which parameter is responsible.

“So this is something we need to be aware of and try to mitigate,” she concludes.
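
The degeneracy behind this negative transfer can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the study's model: the function `toy_signal`, its functional form, and the constants `alpha` and `k0` are all invented for the example. It shows how data generated with a nonzero neutrino mass can be matched almost perfectly by a model that only adjusts σ8.

```python
import numpy as np

# Hypothetical range of scales over which the toy observable is measured.
k = np.linspace(0.5, 1.0, 20)

def toy_signal(sigma8, m_nu, alpha=0.3, k0=0.05):
    """Toy summary statistic: amplitude set by sigma8**2, suppressed by m_nu.

    The functional form and constants are illustrative assumptions only.
    """
    suppression = 1.0 - alpha * m_nu * k / (k + k0)
    return sigma8**2 * suppression

# "True" data include a neutrino mass.
true_sigma8 = 0.8
data = toy_signal(sigma8=true_sigma8, m_nu=0.3)

# Fit a model that knows only sigma8 (m_nu fixed to zero). With m_nu = 0 the
# signal is a flat amplitude sigma8**2, so the least-squares fit is the mean.
best_sigma8 = np.sqrt(data.mean())
residual = data - toy_signal(best_sigma8, 0.0)
max_frac_residual = float(np.max(np.abs(residual / data)))

print(f"true sigma8: {true_sigma8}, fitted sigma8: {best_sigma8:.4f}")
print(f"largest fractional residual: {max_frac_residual:.4%}")
```

The sigma8-only fit absorbs the neutrino suppression by settling on a lower σ8, leaving residuals well under one percent. A network pretrained to explain everything through σ8 faces the same temptation, which is why the two effects are hard to separate.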

Promise and Risks for Future Cosmology

The findings highlight both the potential benefits and limitations of applying foundation model concepts to physics. These approaches are broadly similar in spirit to the techniques behind modern generative AI systems and large language models.

As the researchers note in the paper, pretraining can speed up inference, “but may also hinder learning new physics.”

So far, the approach has only been tested using simulations. The next step will be applying it to real astronomical observations.

The team believes transfer learning could become an important tool for upcoming cosmological surveys, which are expected to collect unprecedented amounts of high-precision data about the universe in the years ahead.

The paper, “Transfer Learning Beyond the Standard Model” by Veena Krishnaraj, Adrian E. Bayer, Christian Kragh Jespersen, and Peter Melchior, is now available in JCAP.
