The Analytical Scientist

The Dark Metabolome Debate Continues: Siuzdak and Giera Respond

Overlooking in-source fragmentation, they argue, could undermine the credibility of metabolomics

By James Strachan 06/05/2025 8 min read

Gary Siuzdak and Martin Giera

The Analytical Scientist recently published an interview with Pieter Dorrestein and Yasin El Abiead following publication of their paper that challenged claims made by Martin Giera, Gary Siuzdak, and colleagues “that most unannotated signals in liquid chromatography–tandem mass spectrometry data are in-source fragments, and hence measurement artefacts rather than new molecules” – a conclusion that, they argued, risked downplaying the true complexity of the metabolome.

Now, in this follow-up, Siuzdak and Giera – authors of the original study – respond to the criticism. They defend their findings, clarify misconceptions, and renew their call for greater analytical rigor in distinguishing real metabolites from phantom signals.

Phantom Metabolites Revisited: In-Source Fragmentation

By Gary Siuzdak, Professor and Director, Scripps Center for Metabolomics, Scripps Research, USA

In 2017, we published work on the thermal degradation of small molecules. At first glance, it appeared to be a straightforward observation – small molecules degrade at high temperatures in an inert atmosphere. However, the implications were far-reaching: experiments involving heat, such as GC-MS and chemical derivatization, may not preserve intact molecules or metabolites as assumed. Instead, they can generate degradation products that were never present in the original biological sample. These are real molecules – but ones formed artifactually during analysis – giving rise to what we refer to as “phantom” metabolites or molecules: products that appear genuine, yet originate from thermal breakdown rather than biology. This study sparked a stimulating discussion, as it indirectly challenged years of data that had largely overlooked the potential for significant thermal change.

Fast forward seven years. Electrospray ionization continues to revolutionize the biological and chemical sciences and has become a cornerstone of mass spectrometry – a truly transformative advancement. Yet recognizing the limitations of any technology is crucial for its practical application. In 2024, we published another seemingly simple observation: significant in-source fragmentation (ISF) occurs under standard LC/MS conditions, generating approximately 70 percent of the detected peaks (counting ISF fragments with a relative intensity of at least 5 percent). This conclusion emerged from analyzing METLIN’s 931,000 molecular standards, where we observed significant fragmentation even without applied collision energy. A recent paper from our lab further validates these results and the importance of considering ISF, among other artifactual signals, when analyzing metabolomic data. A recent paper by the Li lab lends further support, ultimately stating: “Therefore, the ‘dark matter’ of metabolomics is largely explainable.” These papers are also consistent with work by the Patti lab.
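To make the 5 percent criterion concrete, here is a minimal Python sketch of how such a cutoff might be applied to a single spectrum acquired without collision energy. It is an illustration only, not the METLIN analysis pipeline: the `Peak` type, the `flag_candidate_isf` function, the m/z tolerance, and the choice to measure relative intensity against the most intense peak in the spectrum are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio
    intensity: float  # raw signal intensity


def flag_candidate_isf(peaks, precursor_mz, rel_threshold=0.05, mz_tol=0.01):
    """Return peaks that sit below the precursor m/z and reach the
    relative-intensity threshold.

    Assumptions (hypothetical, for illustration): relative intensity is
    measured against the most intense peak in the spectrum, and the
    spectrum was acquired with no applied collision energy.
    """
    if not peaks:
        return []
    base = max(p.intensity for p in peaks)
    if base <= 0:
        return []
    return [
        p for p in peaks
        if p.mz < precursor_mz - mz_tol          # lighter than the intact ion
        and p.intensity / base >= rel_threshold  # at least 5% by default
    ]


# Hypothetical spectrum: intact ion at m/z 300, two lighter peaks, one heavier peak.
spectrum = [Peak(300.0, 1000.0), Peak(282.0, 80.0), Peak(150.0, 30.0), Peak(301.0, 200.0)]
candidates = flag_candidate_isf(spectrum, precursor_mz=300.0)
```

In this toy spectrum only the m/z 282 peak is flagged: the m/z 150 peak falls under the cutoff, and the m/z 301 peak is heavier than the precursor. A flagged peak is only a candidate; confirming it as an in-source fragment would need further evidence such as co-elution with the intact ion.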

The response to our 2024 paper was immediate and polarized, dividing reactions into two distinct camps:

  1. The “ISF is obvious” group. This group saw no controversy, arguing that ISF has been recognized since the 1950s in electron ionization and later in electrospray ionization. However, past electrospray studies examined only a limited subset of molecules, whereas we demonstrated ISF’s prevalence across nearly a million compounds. Here we were able to add statistical weight to the prevalence of ISF.

  2. The denialists. This group rejects our findings and those of the Li lab and Patti lab, and instead continues to embrace the broad “dark metabolome/lipidome” narrative – a narrative which has fueled significant research efforts.

Recently, the denialist camp published a rebuttal claiming we vastly overestimated ISF and artifactual signals. Their argument hinges on conflating molecule identification with metabolite identification and on using a specialized tuning method that minimizes ISF, to which we respond:

  • Too often, the term “dark metabolome” is sensationalized by conflating detected molecules with true biological metabolites, leading to inflated interpretations of novelty. Without rigorous filtering and biological grounding, this romanticized view risks turning analytical noise into supposed discovery. This is an opportunity for these well-known authors to acknowledge the major misannotation that has fueled a decade of biological research into what have recently been termed “phantom metabolites”.

  • Optimizing for minimal ISF, as was done in the rebuttal paper, dramatically reduces sensitivity, which is impractical for real-world metabolomics/lipidomics.

  • Granted, low-intensity ISF can be influenced by biological matrix effects. For example, in a lipid- and protein-rich plasma matrix, the Li lab reported 10 percent ISF, while Zamboni observed 34 percent ISF in other experiments. The extent of ISF detected varies depending on both the biological and chemical matrix. In contrast, for the METLIN data, which was typically acquired from mixtures of either 50 or 100 pure molecular standards, matrix interference was minimal, and ISF was observed at 70 percent.

  • We encourage young scientists to follow the data. The pressure to conform to established ideas is strong, but true scientific progress demands objectivity.

The reality is unavoidable – ISF and the vast number of artifactual “phantom” peaks in LC/MS data must be recognized and addressed in metabolomics, lipidomics, and chemical analysis. Continuing to treat them as a treasure trove of biological insight risks misleading interpretations and will only hold the field back.

A Call for Rigor, Not Romanticism

By Martin Giera, Professor, Translational Metabolomics and Lipidomics, and Head, Metabolomics Group, Center for Proteomics and Metabolomics, Leiden University Medical Center (LUMC), The Netherlands

The term “dark metabolome” has captured the imagination of researchers in untargeted metabolomics, suggesting a vast space of uncharacterized metabolites ripe for discovery. In their recent Nature Metabolism article, Dorrestein et al. present a forward-looking vision of this underexplored realm, highlighting the importance of novel tools, databases, and collaborations to uncover unknown molecules. We share the excitement about discovering uncharted metabolic pathways and molecules, yet we also urge caution to ensure that detectability is not conflated with biological relevance, and that molecules are clearly distinguished from metabolites (as per IUPAC definitions).

A range of studies have shown that a substantial portion of signals detected by modern LC-MS workflows arise from in-source fragments (ISFs), adducts, isotopologues, and contaminants. These features inflate the perceived count of “novel” metabolites if not carefully filtered and placed into biological context. Several groups, including Patti et al. (see here and here), have underscored this issue, and our own work likewise highlights how in-source fragmentation can distort downstream interpretations.
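As a rough illustration of the kind of filtering described above, the following Python sketch labels features that look like a 13C isotopologue or a sodium adduct of another co-eluting feature. It is a simplified sketch under stated assumptions, not the method used in any of the cited studies: the function name `annotate_redundant`, the tolerances, and the restriction to singly charged positive ions are hypothetical choices, and real workflows also consider peak shape, intensity ratios, and many more ion forms.

```python
C13_DELTA = 1.003355    # mass difference between 13C and 12C, in Da
NA_H_DELTA = 21.981943  # m/z shift from [M+H]+ to [M+Na]+, in Da


def annotate_redundant(features, mz_tol=0.005, rt_tol=0.05):
    """Label features that may be redundant signals of another feature.

    features: list of (mz, retention_time_minutes) tuples.
    Returns a dict mapping feature index -> label.

    Assumptions (hypothetical): singly charged positive ions, and
    co-elution is judged only by a fixed retention-time window.
    """
    labels = {}
    for i, (mz_i, rt_i) in enumerate(features):
        for j, (mz_j, rt_j) in enumerate(features):
            if i == j or abs(rt_i - rt_j) > rt_tol:
                continue  # skip self-comparison and non-co-eluting pairs
            delta = mz_i - mz_j
            if abs(delta - C13_DELTA) <= mz_tol:
                labels[i] = f"13C isotopologue of feature {j}"
            elif abs(delta - NA_H_DELTA) <= mz_tol:
                labels[i] = f"possible [M+Na]+ of feature {j}"
    return labels


# Hypothetical feature list: three co-eluting signals and one unrelated feature.
features = [(181.0707, 5.00), (182.0741, 5.01), (203.0526, 5.00), (250.1000, 9.00)]
labels = annotate_redundant(features)
```

Here four raw features collapse to two candidate compounds once the isotopologue and the adduct are grouped with their parent signal, which is the sense in which unfiltered feature counts overstate the number of distinct molecules.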

A recent preprint by the Li lab supports a more measured perspective, reporting that ISFs account for less than 10 percent of high-quality features in 61 public LC-MS datasets from human plasma and serum. This comparatively lower estimate can in part be attributed to specific instrument settings, analysis equipment, and sample types – underscoring that these artifacts are highly context-dependent. Zamboni et al. likewise found around 30 percent ISF in a separate study, suggesting again that the degree of fragmentation and artifacts can vary widely between platforms and experimental conditions. Regardless of the exact fraction, Liu et al. confirm that the total number of actual metabolites is typically much lower than the total number of raw signals, and that many remaining features are unmatched to HMDB without necessarily being biologically relevant. As Liu et al. succinctly put it: “A compound should only be considered as a biological metabolite based on evidence of its activity.” This principle aligns with earlier “activity metabolomics” concepts.

When looking at broader chemical space, such caution seems well placed. While PubChem catalogs more than 100 million molecules, only around 200,000 have been classified as metabolites in databases like HMDB. That is at most about 0.2 percent of the total, and even if that number were doubled, it would remain a tiny fraction. On the flip side, Dorrestein et al. state in their recent report that “Even after accounting for all ion forms, 82 percent of molecules lacked annotations,” only to conclude that “this result underscores the enduring presence of the dark metabolome” [my emphasis].

Because the metabolome is, by definition, composed of bona fide metabolites rather than any random molecule, it is important to clarify that not all signals detected in untargeted experiments necessarily correspond to endogenous metabolites. Consequently, what Dorrestein et al. refer to as “dark matter” may encompass a wide range of chemical entities – some of which might not be biologically relevant. To avoid overinterpreting these unassigned features, we propose a measured approach that emphasizes analytical rigor, careful pre-annotation, and strong biological validation.

None of this implies that novel metabolites do not remain to be discovered – particularly in underexplored organisms like bacteria or other less-characterized species. Nevertheless, the evidence consistently shows that a significant share of untargeted LC-MS data can originate from non-biological artifacts or from molecules with limited or no metabolic relevance. Accordingly, a thoughtful approach – one emphasizing analytical rigor, systematic pre-annotation, and the careful exploration of biological context – can help us better differentiate real metabolic novelty from instrument- or data-driven artifacts.

We are optimistic that, with these considerations in mind, the field can continue unveiling genuinely unknown biochemistry in a robust way. By combining improved analytical protocols, biological validation, and careful data filtering, a more accurate picture of the “dark metabolome” will emerge – one in which genuine discoveries are grounded in both chemical and functional evidence rather than simply in the presence of unmapped peaks.

About the Author(s)

James Strachan

Over the course of my Biomedical Sciences degree it dawned on me that my goal of becoming a scientist didn’t quite mesh with my lack of affinity for lab work. Thinking on my decision to pursue biology rather than English at age 15 – despite an aptitude for the latter – I realized that science writing was a way to combine what I loved with what I was good at. From there I set out to gather as much freelancing experience as I could, spending 2 years developing scientific content for International Innovation, before completing an MSc in Science Communication. After gaining invaluable experience in supporting the communications efforts of CERN and IN-PART, I joined Texere – where I am focused on producing consistently engaging, cutting-edge and innovative content for our specialist audiences around the world.

Copyright © 2025 Texere Publishing Limited (trading as Conexiant), with registered number 08113419 whose registered office is at Booths No. 1, Booths Park, Chelford Road, Knutsford, England, WA16 8GS.