AI model classifies cosmic events using only 15 example images

In a new study published in Nature Astronomy, researchers from the University of Oxford, Google Cloud, and Radboud University used Google's Gemini, a large language model (LLM), to classify cosmic events with high accuracy from only a small set of example images and plain-language instructions.

The team demonstrated that with just 15 example images and straightforward guidance, Gemini could distinguish genuine cosmic events from imaging artefacts with about 93% accuracy. The AI also generated clear explanations for each classification. This approach could make AI tools more transparent and accessible to scientists without deep expertise in machine learning.

Turan Bulmus, co-lead author from Google Cloud, said: "As someone without formal astronomy training, this research is incredibly exciting. It demonstrates how general-purpose LLMs can democratise scientific discovery, empowering anyone with curiosity to contribute meaningfully to fields they might not have a traditional background in. It's a testament to the power of accessible AI to break down barriers in scientific research."

Modern telescopes generate millions of alerts every night, most of which are false signals caused by things like satellite trails or instrumental artefacts. Traditional machine learning systems often work as "black boxes" that give results without explanations, making it hard for scientists to trust or verify their findings. This challenge is expected to grow with the next generation of telescopes that will produce much larger datasets.

The study tested whether a general-purpose AI like Gemini could match the accuracy of specialised models while also explaining its decisions. The researchers provided the model with 15 labelled examples for each of three major sky surveys (ATLAS, MeerLICHT, and Pan-STARRS), each example comprising a new alert image, a reference image, a difference image showing the change, and a short expert note. The model then classified thousands of alerts, assigning labels and priority scores together with written explanations.
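This few-shot workflow maps naturally onto a single multimodal prompt: the instructions, a handful of worked examples (an image triplet plus an expert note), and then the candidate alert to classify. The Python sketch below illustrates the pattern using the public google-generativeai SDK; the file names, prompt wording, and "gemini-1.5-pro" model name are illustrative assumptions and do not reproduce the prompts or configuration used in the paper.

```python
# Sketch of a few-shot, multimodal classification prompt of the kind described
# above. Example data, prompt text, and model name are assumptions for
# illustration only.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")          # assumption: key supplied by the user
model = genai.GenerativeModel("gemini-1.5-pro")  # assumption: any multimodal Gemini model

# Each labelled example is an (alert, reference, difference) image triplet
# plus a short expert note, mirroring the setup described in the article.
EXAMPLES = [
    {"new": "ex01_new.png", "ref": "ex01_ref.png", "diff": "ex01_diff.png",
     "note": "Real transient: point source visible in the new and difference images."},
    # ... up to 15 labelled examples per survey ...
]

def build_prompt(candidate):
    """Assemble the prompt: task instructions, worked examples, then the
    unlabelled candidate alert to classify."""
    parts = [
        "You are classifying optical transient alerts. For each candidate, "
        "give a label (real or bogus), a priority score from 0 to 1, and a "
        "short explanation of your reasoning."
    ]
    for ex in EXAMPLES:
        parts += [Image.open(ex["new"]), Image.open(ex["ref"]),
                  Image.open(ex["diff"]), f"Expert note: {ex['note']}"]
    parts += ["Now classify this candidate:",
              Image.open(candidate["new"]),
              Image.open(candidate["ref"]),
              Image.open(candidate["diff"])]
    return parts

candidate = {"new": "alert_new.png", "ref": "alert_ref.png", "diff": "alert_diff.png"}
response = model.generate_content(build_prompt(candidate))
print(response.text)  # free-text label, priority score, and explanation
```

Presenting the reference and difference images alongside the new alert gives the model the same evidence a human vetter uses to judge what has actually changed on the sky, which is what allows it to explain its classification in plain language.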

Dr Fiorenzo Stoppa from the Department of Physics at the University of Oxford said: "It is striking that a handful of examples and clear text instructions can deliver such accuracy. This makes it possible for a broad range of scientists to develop their own classifiers without deep expertise in training neural networks - only the will to create one."

A panel of 12 astronomers reviewed the AI’s explanations and found them coherent and useful. Professor Stephen Smartt from the Department of Physics at the University of Oxford commented: "I have worked on this problem of rapidly processing data from sky surveys for over 10 years, and we are constantly plagued by weeding out the real events from the bogus signals in the data processing...the LLM’s accuracy at recognising sources with minimal guidance rather than task-specific training was remarkable. If we can engineer to scale this up, it could be a total game changer for the field, another example of AI enabling scientific discovery."

The study also found that Gemini could assess the quality of its own answers by assigning a coherence score to each explanation. Low-coherence answers were more likely to be incorrect. This self-assessment allows the system to flag uncertain cases for human review. By refining the initial examples based on this feedback loop, the researchers improved the model's accuracy on one dataset from about 93.4% to 96.7%.
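As a rough illustration of this self-assessment step, the sketch below asks the model to score the coherence of its own answer and routes low-scoring cases to a human reviewer. The 0-to-1 scale, the threshold value, and the prompt wording are assumptions for illustration, not the study's implementation.

```python
# Minimal sketch of a coherence-based triage loop, assuming the model returns
# bare JSON; production code would validate the reply before parsing.
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # assumption: key supplied by the user
model = genai.GenerativeModel("gemini-1.5-pro")  # assumption: same model as above

def score_coherence(classification_text: str) -> float:
    """Ask the model to grade how internally consistent its own
    classification and explanation are."""
    prompt = (
        "Rate the coherence of the following transient classification and "
        "explanation on a scale from 0 (incoherent) to 1 (fully coherent). "
        'Reply only with JSON of the form {"coherence": <float>}.\n\n'
        + classification_text
    )
    reply = model.generate_content(prompt)
    return float(json.loads(reply.text)["coherence"])

def triage(classification_text: str, threshold: float = 0.7):
    """Flag uncertain answers for human review; accept the rest."""
    score = score_coherence(classification_text)
    return ("human_review" if score < threshold else "accept"), score
```

Because the study found that low-coherence explanations were more likely to be wrong, a threshold-based triage of this kind mirrors the feedback loop described above: uncertain cases go to humans, and the refined examples feed back into the prompt.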

Looking ahead, the researchers see this method as the basis for autonomous assistants that can integrate different data sources, check their own confidence levels, request follow-up observations automatically, and escalate only the most promising discoveries to human experts. The technique could be adapted quickly for different instruments and research goals since it requires only a small set of examples and simple instructions.

Turan Bulmus added: "We are entering an era where scientific discovery is accelerated not by black-box algorithms, but by transparent AI partners. This work shows a path towards systems that learn with us, explain their reasoning, and empower researchers in any field to focus on what matters most: asking the next great question."

The study is titled ‘Textual interpretation of transient image classifications from large language models’ and is available in Nature Astronomy.
