Facial Recognition Deep Learning Software is Surprisingly Good at Identifying Galaxies Too

A lot of attention has been dedicated to the machine learning technique known as “deep learning”, in which computers learn to discern patterns in data without being explicitly programmed to do so. In recent years, this technique has been put to a number of uses, including voice and facial recognition for social media platforms like Facebook.

However, astronomers are also benefiting from deep learning, which is helping them to analyze images of galaxies and understand how they form and evolve. In a new study, a team of international researchers used a deep learning algorithm to analyze images of galaxies from the Hubble Space Telescope. This method proved effective at classifying these galaxies based on the stage of evolution they were in.

The study, titled “Deep Learning Identifies High-z Galaxies in a Central Blue Nugget Phase in a Characteristic Mass Range”, recently appeared online and has been accepted for publication in the Astrophysical Journal. The study was led by Marc Huertas-Company of the University Paris Diderot and included members from the University of California Santa Cruz (UCSC), the Hebrew University, the Space Telescope Science Institute, the University of Pennsylvania Philadelphia, MINES ParisTech and Shanghai Normal University (SHNU).

A ‘deep learning’ algorithm trained on images from cosmological simulations is surprisingly successful at classifying real galaxies in Hubble images. Credit: HST/CANDELS

Huertas-Company has previously applied deep learning methods to Hubble data for the sake of galaxy classification. In collaboration with David Koo and Joel Primack, both of whom are professors emeriti at UC Santa Cruz (and with support from Google), Huertas-Company and the team spent the past two summers developing a neural network that could identify galaxies at different stages in their evolution.

“This project was just one of several ideas we had,” said Koo in a recent UCSC press release. “We wanted to pick a process that theorists can define clearly based on the simulations, and that has something to do with how a galaxy looks, then have the deep learning algorithm look for it in the observations. We’re just beginning to explore this new way of doing research. It’s a new way of melding theory and observations.”

For the sake of their study, the researchers used computer simulations to generate mock images of galaxies as they would look in observations by the Hubble Space Telescope. The mock images were used to train the deep learning neural network to recognize three key phases of galaxy evolution that had been previously identified in the simulations. The researchers then used the network to analyze a large set of actual Hubble images.
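The train-on-simulations, apply-to-observations workflow described above can be illustrated with a toy sketch. This is not the team's neural network: it swaps in a much simpler nearest-centroid classifier, and everything in it is an assumption made for illustration, including the phase labels, the two made-up features per image (a "compactness" and a "blueness" score), and all of the numbers. The "mock" and "observed" samples are synthetic placeholders, not simulation or Hubble data.

```python
import random

# Placeholder labels for the three phases; the article does not name them.
PHASES = ("pre_blue_nugget", "blue_nugget", "post_blue_nugget")

# Hypothetical (compactness, blueness) centers for each phase. Made-up values.
MOCK_CENTERS = {
    "pre_blue_nugget": (0.2, 0.6),
    "blue_nugget": (0.8, 0.9),
    "post_blue_nugget": (0.8, 0.2),
}


def make_mock_set(n_per_phase, noise, rng):
    """Generate labelled toy 'mock images' as ((x, y), label) pairs."""
    samples = []
    for label, (cx, cy) in MOCK_CENTERS.items():
        for _ in range(n_per_phase):
            features = (rng.gauss(cx, noise), rng.gauss(cy, noise))
            samples.append((features, label))
    return samples


def train_centroids(samples):
    """'Train' by averaging the feature vectors of each labelled phase."""
    sums = {label: [0.0, 0.0, 0] for label in PHASES}
    for (x, y), label in samples:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def classify(centroids, features):
    """Assign the phase whose centroid is closest to the given features."""
    x, y = features

    def squared_distance(label):
        cx, cy = centroids[label]
        return (x - cx) ** 2 + (y - cy) ** 2

    return min(centroids, key=squared_distance)


# Step 1: train only on the labelled mock (simulated) samples.
rng = random.Random(0)
centroids = train_centroids(make_mock_set(200, 0.05, rng))

# Step 2: apply the trained classifier to unlabelled 'observed' samples.
observed = [(0.78, 0.88), (0.25, 0.55), (0.82, 0.15)]
predictions = [classify(centroids, features) for features in observed]
print(predictions)
```

The point of the sketch is the separation of roles: labels exist only for the simulated samples, and the classifier never sees a labelled observation. A real pipeline would work on image pixels with a deep neural network rather than on two hand-picked numbers.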

As with previous images analyzed by Huertas-Company, these images were part of Hubble’s Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) project, the largest project in the history of the Hubble Space Telescope. What they found was that the neural network’s classifications of simulated and real galaxies were remarkably consistent. As Joel Primack explained:

“We were not expecting it to be all that successful. I’m amazed at how powerful this is. We know the simulations have limitations, so we don’t want to make too strong a claim. But we don’t think this is just a lucky fluke.”

A spiral galaxy ablaze in the blue light of young stars from ongoing star formation (left) and an elliptical galaxy bathed in the red light of old stars (right). Credit: SDSS


The research team was especially interested in galaxies that have a small, dense, star-forming region known as a “blue nugget”. These regions occur early in the evolution of gas-rich galaxies, when big flows of gas into the center of a galaxy cause the formation of young stars that emit blue light. To simulate these and other types of galaxies, the team relied on state-of-the-art VELA simulations developed by Primack and an international team of collaborators.

In both the simulated and observational data, the computer program found that the “blue nugget” phase occurs only in galaxies with masses within a certain range. This was followed by star formation ending in the central region, leading to the compact “red nugget” phase, where the stars in the central region exit their main sequence phase and become red giants.

The consistency of the mass range was exciting because it indicated that the neural network was identifying a pattern that results from a key physical process in real galaxies, and without having to be specifically told to do so. As Koo indicated, this study is a big step forward for astronomy and AI, but a lot of research still needs to be done:

“The VELA simulations have had a lot of success in terms of helping us understand the CANDELS observations. Nobody has perfect simulations, though. As we continue this work, we will keep developing better simulations.”

Artist’s representation of an active galactic nucleus (AGN) at the center of a galaxy. Credit: NASA/CXC/M.Weiss

For instance, the team’s simulations did not include the role played by Active Galactic Nuclei (AGN). In larger galaxies, gas and dust are accreted onto a central Supermassive Black Hole (SMBH) at the core, which causes gas and radiation to be ejected in huge jets. Some recent studies have indicated that this may have an arresting effect on star formation in galaxies.

However, observations of distant, younger galaxies have shown evidence of the phenomenon observed in the team’s simulations, where gas-rich cores lead to the blue nugget phase. According to Koo, using deep learning to study galactic evolution has the potential to reveal previously undetected aspects of observational data. Instead of observing galaxies as snapshots in time, astronomers will be able to simulate how they evolve over billions of years.

“Deep learning looks for patterns, and the machine can see patterns that are so complex that we humans don’t see them,” he said. “We want to do a lot more testing of this approach, but in this proof-of-concept study, the machine seemed to successfully find in the data the different stages of galaxy evolution identified in the simulations.”

In the future, astronomers will have more observational data to analyze thanks to the deployment of next-generation telescopes like the Large Synoptic Survey Telescope (LSST), the James Webb Space Telescope (JWST), and the Wide-Field Infrared Survey Telescope (WFIRST). These telescopes will provide even more massive datasets, which can then be analyzed by machine learning methods to determine what patterns exist.

Astronomy and artificial intelligence, working together to better our understanding of the Universe. I wonder if we should set AI to the task of finding a Theory of Everything (ToE) too!

Further Reading: UCSC, Astrophysical Journal