Raman spectroscopy is widely used in biological analysis, real-time monitoring of chemical reactions, and material identification. However, spectra collected by different instruments vary, which can undermine the accuracy and reproducibility of results. Methods such as environmental parameter correction and standard Raman spectral databases are insufficient to fully mitigate this inter-instrument variation.
In a study published in Spectrochimica Acta, researchers from the Changchun Institute of Optics, Fine Mechanics and Physics of the Chinese Academy of Sciences addressed the issue of inter-instrument variation in Raman spectroscopy using a Siamese network. They introduced a modular design that facilitates the fusion training and mutual prediction of Raman spectra, enhancing the accuracy and reliability of spectral analysis across different instruments.
The researchers first assembled three Raman spectral datasets, collected from different bacterial strains using different Raman spectrometers. The spectra were preprocessed by removing spikes, subtracting baselines, and normalizing. The researchers then evaluated how well three models with different architectures (ResNet, Transformer, and LSTM) classified the Raman spectra.
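The three preprocessing steps can be sketched as follows. The article does not say which algorithms the researchers used, so everything here is an assumption chosen for illustration: spikes are detected by comparison with a local median, the baseline is estimated with a rolling minimum, and normalization is min-max scaling. The function names, window sizes, and threshold are hypothetical.

```python
from statistics import median


def local_medians(spectrum, window=5):
    """Median of each point's neighbourhood (the window is truncated at the edges)."""
    half = window // 2
    n = len(spectrum)
    return [median(spectrum[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def remove_spikes(spectrum, window=5, threshold=5.0):
    """Replace points that deviate strongly from their local median.

    Assumed rule: a point is a spike if its residual exceeds `threshold`
    times the median residual of the whole spectrum.
    """
    medians = local_medians(spectrum, window)
    residuals = [abs(x - m) for x, m in zip(spectrum, medians)]
    scale = median(residuals) or 1e-12  # avoid a zero scale on perfectly flat data
    return [m if r > threshold * scale else x
            for x, m, r in zip(spectrum, medians, residuals)]


def subtract_baseline(spectrum, window=51):
    """Subtract a rolling-minimum baseline estimate (a crude stand-in for a fitted baseline)."""
    half = window // 2
    n = len(spectrum)
    return [x - min(spectrum[max(0, i - half):min(n, i + half + 1)])
            for i, x in enumerate(spectrum)]


def normalize(spectrum):
    """Min-max scale intensities to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0] * len(spectrum)
    return [(x - lo) / (hi - lo) for x in spectrum]


def preprocess(spectrum):
    """Apply the three steps in the order the article lists them."""
    return normalize(subtract_baseline(remove_spikes(spectrum)))
```

On real data, a polynomial or asymmetric least-squares baseline would usually be a better choice than the rolling minimum used here.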
Instead of training these models directly for classification, the researchers adopted a Siamese network approach. The Siamese network assesses whether Raman spectra collected by two instruments belong to the same class: it outputs a feature distance between the two spectra, which is then mapped to a similarity value. The class of an unknown spectrum can then be determined by comparing its similarity to the reference spectra of each known class.
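A minimal sketch of this comparison step is given below, assuming the spectra have already been turned into fixed-length feature vectors by a trained encoder. The article does not specify the distance metric, the mapping from distance to similarity, or how scores are aggregated per class. Euclidean distance, the mapping `exp(-d)`, and averaging over each class's references are assumptions made for illustration, and all function names are hypothetical.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(a, b):
    """Map a distance in [0, inf) to a similarity in (0, 1]; identical vectors give 1.0."""
    return math.exp(-feature_distance(a, b))


def classify(query, references):
    """Assign the class whose reference vectors are, on average, most similar to the query.

    `references` maps each class label to a list of reference feature vectors.
    Returns the winning label and the per-class mean similarities.
    """
    scores = {
        label: sum(similarity(query, ref) for ref in refs) / len(refs)
        for label, refs in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy reference set with two hypothetical classes in a 2-D feature space.
references = {
    "strain_A": [[0.0, 0.0], [0.1, 0.0]],
    "strain_B": [[1.0, 1.0]],
}
label, scores = classify([0.05, 0.0], references)
```

Because classification reduces to comparison against references, a new class can in principle be added by supplying reference spectra for it, without retraining a fixed output layer.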
The Siamese network was implemented with multiple projection layers to enable a modular design that decouples the spectral encoding layer from the classifier. This design makes it easy to plug in different spectral encoders, facilitating the fusion training and mutual prediction of Raman spectra of different lengths.
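One way to picture the modular design is sketched below: each instrument gets its own branch, made of an interchangeable encoder followed by a projection layer that maps the encoder output into a shared embedding space of fixed size. This is an illustrative assumption, not the published architecture. The mean-pooling encoder stands in for a ResNet, Transformer, or LSTM encoder, the projection weights are random rather than trained, and every class name and dimension is hypothetical.

```python
import random


def mean_pool_encoder(n_bins):
    """Hypothetical stand-in encoder: average the spectrum into n_bins segments.

    Requires len(spectrum) >= n_bins so that every segment is non-empty.
    """
    def encode(spectrum):
        n = len(spectrum)
        edges = [i * n // n_bins for i in range(n_bins + 1)]
        return [sum(spectrum[a:b]) / (b - a) for a, b in zip(edges, edges[1:])]
    return encode


class LinearProjection:
    """Linear map from an encoder's feature size to the shared embedding size."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)  # untrained, randomly initialised weights
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def __call__(self, features):
        return [sum(w * f for w, f in zip(row, features)) for row in self.weights]


class Branch:
    """Encoder plus projection; the encoder can be swapped without touching the rest."""

    def __init__(self, encoder, projection):
        self.encoder = encoder
        self.projection = projection

    def __call__(self, spectrum):
        return self.projection(self.encoder(spectrum))


EMBED_DIM = 8  # shared embedding size (assumed)

# Two instruments producing spectra of different lengths (1000 and 600 points).
branch_a = Branch(mean_pool_encoder(16), LinearProjection(16, EMBED_DIM, seed=1))
branch_b = Branch(mean_pool_encoder(32), LinearProjection(32, EMBED_DIM, seed=2))

embedding_a = branch_a([0.5] * 1000)
embedding_b = branch_b([0.5] * 600)
```

Both embeddings have the same length, so a single distance or similarity function can compare spectra from the two instruments even though the raw spectra differ in length.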
The Siamese network approach significantly improved the classification accuracy of Raman spectra across different instruments, especially for spectra of the same resolution. The Transformer-based Siamese network performed best, which was attributed to its ability to process sequential data in parallel through its attention mechanism.
By enabling more accurate and reproducible spectral analysis across different instruments, the Siamese network approach will benefit applications such as biological analysis and material identification, where consistent and reliable results are crucial.