New AI Model Detects Early Neurological Disorders with High Accuracy

Jun 27, 2025

A research team led by Prof. LI Hai from the Hefei Institutes of Physical Science of the Chinese Academy of Sciences has developed a novel deep learning framework that significantly improves the accuracy and interpretability of detecting neurological disorders through speech. 

The findings were recently published in Neurocomputing.

"A slight change in the way we speak might be more than just a slip of the tongue; it could be a warning sign from the brain," said Prof. LI Hai, who led the team. "Our new model can detect early symptoms of neurological diseases such as Parkinson's, Huntington's, and Wilson disease by analyzing voice recordings."

Dysarthria is a common early symptom of various neurological disorders. Since speech abnormalities often reflect underlying neurodegenerative processes, voice signals have emerged as promising noninvasive biomarkers for the early screening and continuous monitoring of such conditions. Automated speech analysis is efficient, inexpensive, and noninvasive. However, current mainstream methods often rely too heavily on handcrafted features, have limited capacity to model interactions between time steps and variables, and are difficult to interpret.

To address these challenges, the researchers proposed the Cross-Time and Cross-Axis Interactive Transformer (CTCAIT) for multivariate time series analysis. This framework first employs a large-scale audio model to extract high-dimensional temporal features from speech, representing them as multidimensional embeddings along time and feature axes. It then uses the InceptionTime network to capture multi-scale and multi-level patterns within the time series.
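
As a rough illustration of what "multi-scale" means here, the sketch below runs several 1-D filters of different widths over the same series in parallel, one output branch per width. It is not the researchers' code: the real InceptionTime network uses learned filters together with bottleneck layers, a pooling branch, and residual connections, whereas this sketch assumes fixed moving-average kernels as a stand-in. The function names and kernel sizes are hypothetical.

```python
def conv1d_same(series, kernel):
    """1-D cross-correlation with zero padding; output length equals input length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(series) + [0.0] * (k - 1 - pad)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(series))
    ]


def multi_scale_features(series, kernel_sizes=(3, 5, 9)):
    """Filter one series at several widths in parallel (assumed averaging kernels)."""
    branches = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k  # stand-in for a learned filter of width k
        branches.append(conv1d_same(series, kernel))
    return branches  # one list per kernel size, each as long as the input


# Toy input: a single feature channel over 10 time steps.
features = multi_scale_features([1.0] * 10)
print(len(features), len(features[0]))  # 3 10
```

Short kernels respond to rapid local changes and long kernels to slower trends, which is the intuition behind combining several widths.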

By integrating cross-time and cross-channel multi-head attention mechanisms, CTCAIT effectively captures pathological speech signatures embedded across different dimensions.
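
The idea of attending along two axes can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the published model: it uses a single attention head with no learned projections, and it fuses the two branches by simple addition, which is an assumption made only for this sketch. All function names are hypothetical. The input is a small matrix with one row per time step and one column per feature channel.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def cross_axis_block(x):
    """x holds T time steps, each a list of C channel values."""
    # Cross-time branch: every time step is a token that attends to other time steps.
    across_time = self_attention(x)
    # Cross-channel branch: every channel is a token that attends to other channels.
    across_channels = transpose(self_attention(transpose(x)))
    # Assumed fusion for this sketch: element-wise sum of the two branches.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(across_time, across_channels)]


# Toy input: 4 time steps, 3 feature channels.
x = [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.4, 0.1, 0.2], [0.3, 0.3, 0.3]]
y = cross_axis_block(x)
print(len(y), len(y[0]))  # 4 3
```

Transposing the input is what switches the attention from relating time steps to relating feature channels; the output keeps the same time-by-channel shape as the input.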

The method achieved a detection accuracy of 92.06% on a Mandarin Chinese dataset and 87.73% on an external English dataset, demonstrating strong cross-linguistic generalizability.

Furthermore, the researchers analyzed how the model reaches its decisions and systematically compared the effectiveness of different speech tasks. These results offer guidance for potential clinical use of the method in the early diagnosis and monitoring of neurological disorders.

This study was supported by the National Natural Science Foundation of China, the Natural Science Foundation of Anhui Province, and the Anhui Provincial Key Research and Development Program.

Contact

ZHAO Weiwei

Hefei Institutes of Physical Science

Multivariate time series approach integrating cross-temporal and cross-channel attention for dysarthria detection from speech
