Dr. Thomas H. Puzia | Next-Generation Galaxy Surveys

2025-12-23 Thomas H. Puzia

Finding Cosmic Needles in a Galactic Haystack

Color-color diagrams for all sources with multi-wavelength photometry in the Fornax core region, shown as gray dots. Spectroscopically confirmed samples are shown for GCs (blue), stars (golden) and galaxies (purple). These are used as labeled samples for the SVC model.

Globular clusters (GCs), ancient and dense balls of stars orbiting galaxies, are some of the universe's most abundant fossil records. They encode when and how galaxies assembled themselves billions of years ago. While GCs in the Local Group can be resolved into individual stars, 19 Mpc away they look almost identical to both foreground stars in our own Milky Way and distant background galaxies. Traditional approaches using simple color cuts produce contamination rates of 30–70% and make any inference of stellar population properties ambiguous. That's not science; that's a coin flip with extra steps.

In our recent paper, Ordenes-Briceño et al. (2025), we decided that the humble Support Vector Machine (SVM), an algorithm old enough to have a driver's license, could solve this better than brute-force color selections ever did. Using data from the Next Generation Fornax Survey (NGFS) spanning near-UV to near-infrared (u'g'i'JKs), our team built an SVM classifier trained on ~5,000 spectroscopically confirmed sources: ~1,200 globular clusters, ~2,100 foreground stars, and ~1,600 background galaxies. We fed it 15 features: 10 color combinations plus morphological parameters like FWHM and the spread model. The key insight: not all features are created equal. Through permutation importance analysis and correlation clustering, we pruned the set down to just 7 features: five colors spanning the full UV-to-NIR baseline plus two structural parameters. This leaner model hits 96.6% accuracy with ~10% misclassification, while avoiding the overfitting trap that inflated the 15-feature model's scores.
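For readers who want to try the idea, here is a minimal sketch of the train, rank, and prune workflow described above, using scikit-learn. It is not the pipeline from the paper: the data are synthetic (`make_classification` stands in for the spectroscopic training set), the kernel and hyperparameters are assumptions, and the correlation-clustering step is left out for brevity. The three class labels simply play the role of GCs, stars, and galaxies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the labeled sample (assumption, not NGFS data):
# 3 classes and 15 features, of which only 7 carry independent information.
X, y = make_classification(
    n_samples=1500, n_features=15, n_informative=7, n_redundant=4,
    n_classes=3, n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def make_svm():
    # Scaling matters for SVMs; the RBF kernel and C value are illustrative choices.
    return make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))


# Step 1: train on all 15 features and score on held-out data.
full_model = make_svm().fit(X_train, y_train)
full_score = full_model.score(X_test, y_test)

# Step 2: permutation importance, i.e. shuffle one feature at a time on the
# held-out set and measure how much the accuracy drops.
result = permutation_importance(
    full_model, X_test, y_test, n_repeats=10, random_state=0
)
ranked = np.argsort(result.importances_mean)[::-1]

# Step 3: keep the 7 highest-ranked features and retrain a leaner model.
top7 = np.sort(ranked[:7])
lean_model = make_svm().fit(X_train[:, top7], y_train)
lean_score = lean_model.score(X_test[:, top7], y_test)

print("15-feature accuracy:", round(full_score, 3))
print("kept feature indices:", top7.tolist())
print("7-feature accuracy:", round(lean_score, 3))
```

Computing the importances on the held-out split rather than the training split is a deliberate choice here: it ranks features by what helps generalization, not by what the model happened to memorize.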

The color pair (u'−g') vs. (g'−Ks) emerged as the MVP: connecting near-UV to near-IR gives maximum leverage for separating the three populations. Drop the u'-band? Performance degrades. Drop the NIR? You're back to selecting nearly 12,000 "globular clusters", i.e. three times too many.

This has direct implications for the coming data tsunami. The Vera C. Rubin Observatory's LSST will detect millions of unresolved sources, and Euclid is already flying and has delivered its first batch of early release data. We therefore tested LSST-like filter combinations and found that without NIR support from Euclid or Roman, optical-only classification remains suboptimal. The message to survey designers is clear: your UV and NIR bands aren't luxury add-ons; they're load-bearing walls for astrophysics research.

The final catalog delivers 2,926 globular cluster candidates in the Fornax cluster core, a clean, probability-vetted sample ready for serious science on dark matter halos, intracluster light, and galaxy assembly history. We are actively exploiting these data for some stunning new insights, so stay tuned.

Sometimes the best tool for the job isn't the newest deep learning architecture, it's a well-tuned classic with the right data behind it.

Fornax NGFS survey SVM globular clusters galaxies stars near-UV near-IR

Neighborhood Watch: Mapping the Local Group Beyond the Virial Radius

Our Neighborhood Watch program extends the reach of next-generation surveys into the low-density outskirts of nearby galaxy clusters. By combining deep optical and near-infrared photometry, we are uncovering faint dwarf galaxies and diffuse stellar streams that trace the ongoing assembly of large-scale structure at the virial boundary and beyond.

NGVS dwarf galaxies large-scale structure

2024-11-20 Thomas H. Puzia

NGVS-IR: Deep Near-Infrared Imaging of the Virgo Cluster

The NGVS-IR program uses WIRCam on the CFHT 3.6 m and VIRCAM on the VISTA 4 m telescopes to obtain deep J- and Ks-band imaging of the Virgo cluster core around M87 and M49, reaching depths of J ≈ 23.0 and Ks ≈ 23.0 AB mag. These data provide critical rest-frame stellar-mass tracers for thousands of compact stellar systems and low-surface-brightness galaxies.

NGVS-IR near-infrared Virgo cluster
