Humans Can Still Find Galaxies That Machine Learning Algorithms Miss

The age of big data is upon us, and there are scarcely any fields of scientific research that are not affected. Take astronomy, for example. Thanks to cutting-edge instruments, software, and data-sharing, observatories worldwide are accumulating hundreds of terabytes in a single day and between 100 and 200 petabytes a year. Once next-generation telescopes become operational, astronomy will likely enter the “exabyte era,” where 10^18 bytes (one quintillion) of data are obtained annually. To keep up with this volume, astronomers are turning to machine learning and AI to handle the job of analysis.

Although AI plays a growing role in data analysis, there are some instances where citizen astronomers are proving more capable. While examining data collected by the Dark Energy Survey (DES), amateur astronomer Giuseppe Donatiello discovered three faint galaxies that a machine-learning algorithm had apparently missed. These galaxies, all satellites of the Sculptor Galaxy (NGC 253), are now named Donatiello II, III, and IV, in his honor. In this day of data-driven research, it’s good to know that sometimes there’s no substitute for human eyeballs and intellect.

Right in the middle of this image lies the newly discovered dwarf galaxy known as Donatiello II, one of three newly discovered galaxies. Credit: ESA/Hubble/NASA/B. Mutlu-Pakdil; Acknowledgement: G. Donatiello

The presence of these satellites around the Sculptor Galaxy, located 11.4 million light-years from Earth, was confirmed by a team of astronomers using the Hubble Space Telescope. The team was led by Burçin Mutlu-Pakdil, an assistant professor of astrophysics at Dartmouth College (for whom Burçin’s Galaxy is named). The image above, which shows Donatiello II in the center, was part of a series of long-exposure images of faint galaxies. It has since become a Picture of the Week on the European Space Agency’s (ESA) website.

Reliance on AI has increased considerably in recent years in direct response to the exponential increase in data obtained by astronomical observatories. In recent months, machine learning algorithms have been developed to search for exoplanets, fast radio bursts (FRBs), and possible technosignatures, and to map the epoch known as Cosmic Dawn. But when it came to the DES, an international collaboration dedicated to mapping the cosmos to measure the nature and influence of Dark Energy, the algorithm the collaboration used failed to detect these satellite galaxies.

This is not that surprising since even the best algorithms have their limitations. To develop machine learning techniques, astronomers train their algorithms using images and data of specific phenomena. Because some galaxies are so faint, AIs have difficulty distinguishing them from individual stars and background noise. When that happens, identification must be done using the old-fashioned method of trained eyeballs combing through stacks of images and raw data.

Take that, Skynet!

Further Reading: NASA, ESA