A new machine learning algorithm could improve the speed and interpretability of spectral analysis across biomedical and materials science applications. Developed by researchers at Rice University and collaborators, the method – called peak-sensitive elastic-net logistic regression (PSE-LR) – is designed to extract diagnostically relevant features from complex optical spectra with a focus on transparency and peak-level resolution.
“Imagine being able to detect early signs of diseases like Alzheimer’s or COVID-19 just by shining a light on a drop of fluid or a tissue sample,” said Ziyang Wang, lead author and doctoral student in electrical and computer engineering at Rice. “Our work makes this possible by teaching computers how to better ‘read’ the signal of light scattered from tiny molecules.”
PSE-LR addresses common challenges in spectral data interpretation by combining logistic regression with a peak-sensitive weighting mechanism and elastic-net regularisation. This enables the model to prioritise sharp, information-rich features while suppressing background variability. Unlike many machine learning models that act as black boxes, PSE-LR generates a feature importance map that highlights which parts of the spectrum contributed most to a classification decision.
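For readers who want a feel for how these ingredients can fit together, the following is a minimal Python sketch. It is not the authors' implementation. The article does not give the PSE-LR equations, so the peak-sensitivity score (the absolute second difference of the mean spectrum), the way that score scales the elastic-net penalty, the synthetic spectra, and every function name and parameter value here are assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def make_spectrum(peak_height, rng, n_points=41, noise=0.02):
    # Toy spectrum (assumed): slow background + one narrow Gaussian peak at channel 20.
    return [0.5 + 0.3 * math.sin(i / 20.0)
            + peak_height * math.exp(-((i - 20) ** 2) / 8.0)
            + rng.gauss(0.0, noise)
            for i in range(n_points)]

def standardize(X):
    # Zero mean and unit variance per spectral channel; also return the mean spectrum.
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)]
    Z = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]
    return Z, means

def peak_weights(mean_spectrum):
    # ASSUMPTION: "peak sensitivity" = |second difference| of the mean spectrum,
    # scaled to [0, 1]. Sharp peaks score high, smooth background scores near 0.
    d = len(mean_spectrum)
    curv = [0.0] * d
    for j in range(1, d - 1):
        curv[j] = abs(mean_spectrum[j - 1] - 2 * mean_spectrum[j] + mean_spectrum[j + 1])
    top = max(curv) or 1.0
    return [c / top for c in curv]

def fit(Z, y, penalty, lam=0.1, l1_ratio=0.5, lr=0.1, n_iter=200):
    # Proximal gradient descent on: mean log-loss
    #   + lam * sum_j penalty[j] * (l1_ratio*|b_j| + (1 - l1_ratio)/2 * b_j**2)
    n, d = len(Z), len(Z[0])
    beta, b0 = [0.0] * d, 0.0
    for _ in range(n_iter):
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) - yi
                 for row, yi in zip(Z, y)]
        b0 -= lr * sum(resid) / n
        for j in range(d):
            grad = (sum(r * row[j] for r, row in zip(resid, Z)) / n
                    + lam * (1 - l1_ratio) * penalty[j] * beta[j])
            step = beta[j] - lr * grad
            thresh = lr * lam * l1_ratio * penalty[j]  # soft-threshold for the L1 part
            beta[j] = math.copysign(max(abs(step) - thresh, 0.0), step)
    return beta, b0

def predict(row, beta, b0):
    return 1 if sigmoid(b0 + sum(bj * xj for bj, xj in zip(beta, row))) > 0.5 else 0

rng = random.Random(0)
# Two synthetic classes whose only difference is the height of the peak.
X = ([make_spectrum(1.00, rng) for _ in range(60)]
     + [make_spectrum(1.15, rng) for _ in range(60)])
y = [0] * 60 + [1] * 60

Z, mean_spectrum = standardize(X)
w = peak_weights(mean_spectrum)
penalty = [1.0 - 0.9 * wj for wj in w]   # assumed rule: peak channels are penalised less

beta, b0 = fit(Z, y, penalty)
importance = [abs(b) for b in beta]      # per-channel feature importance map
top_channel = max(range(len(importance)), key=importance.__getitem__)
accuracy = sum(predict(row, beta, b0) == yi for row, yi in zip(Z, y)) / len(y)
print("most important channel:", top_channel, "training accuracy:", accuracy)
```

In this toy setup the `importance` list plays the role of the feature importance map: plotting it against the channel index shows which parts of the spectrum drive the classification. Lowering the penalty on sharp features is only one plausible way to make a regularised classifier peak-sensitive, and the published method may differ.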
“Our algorithm was designed to focus on the most important parts of the signal – the peaks that matter most,” Wang said. “It’s like a detective learning to find clues hidden in light signals.”
In performance tests, PSE-LR outperformed standard approaches including PCA-LDA, support vector machines, and deep neural networks. In simulations, it resolved spectral differences as small as 3 percent in peak intensity. In experimental datasets, it classified ultralow concentrations of the SARS-CoV-2 spike protein’s receptor-binding domain using Raman spectroscopy, identified neuroprotective signatures in mouse brain tissue, distinguished Alzheimer’s disease samples, and separated 2D semiconductor materials using photoluminescence spectra.
“Most models either miss the tiny details or are too complex to understand,” Wang added. “We aimed to fix that by building something both smart and explainable.”
According to corresponding author Shengxi Huang, associate professor of electrical and computer engineering, the model’s ability to isolate meaningful signals in dense or overlapping spectra makes it broadly applicable across optical techniques. “Our tool is able to parse light-based data for very subtle signals that are usually hard to pick up on using traditional methods,” she said.
The researchers note that while the algorithm requires some tuning for optimal performance, its structure lends itself well to integration in diagnostic or sensor platforms where interpretability is essential. “These findings could help transform medical diagnostics and materials science,” Wang said, “bringing us closer to a world where smart technologies help detect and respond to health problems faster and more effectively.”