Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech

arXiv:2602.13047v1 Announce Type: new Abstract: Conversational speech often reveals early signs of cognitive decline, such as dementia and MCI. In the UK, one in four people belongs to an ethnic minority, and dementia prevalence is expected to rise most rapidly among Black and Asian communities. This study examines the trustworthiness of AI models, specifically the presence of bias, in detecting healthy multilingual English speakers among a cognitively impaired cohort, with the aim of making these tools clinically beneficial. For the experiments, monolingual participants were recruited nationally (UK), and multilingual speakers were enrolled from four community centres in Sheffield and Bradford. In addition to speaking English with a non-native accent, multilingual participants spoke Somali, Chinese, or South Asian languages, and participants were further divided into two Yorkshire accent groups (West and South) to test the AI tools thoroughly. Although ASR systems showed no significant bias across groups, classification and regression models using acoustic and linguistic features exhibited bias against multilingual speakers, particularly in memory, fluency, and reading tasks. This bias was more pronounced when models were trained on the publicly available DementiaBank dataset. Moreover, multilinguals were more likely to be misclassified as having cognitive decline. This study is the first of its kind to find that, despite their strong overall performance, current AI models show bias against multilingual individuals from ethnic minority backgrounds in the UK, and that they are also more likely to misclassify speakers with a certain accent (South Yorkshire) as living with more severe cognitive decline. In this pilot study, we conclude that the existing AI tools are therefore not yet reliable for diagnostic use in these populations, and we aim to address this in future work by developing more generalisable, bias-mitigated models.

Executive Summary

This study investigates the reliability and potential biases of AI models in detecting cognitive decline among multilingual English speakers in the UK. The research assesses how well AI tools distinguish healthy individuals from those with cognitive impairments, particularly within ethnic minority communities. While ASR performance showed no significant bias across groups, the downstream classification and regression models exhibited significant biases against multilingual speakers, especially those with non-native accents, leading to higher misclassification rates. The study highlights the limitations of current AI models trained on datasets such as DementiaBank, which may not generalize to diverse populations. The authors call for the development of more inclusive, bias-mitigated AI tools to ensure accurate and equitable diagnostic outcomes.

Key Points

  • AI models show bias against multilingual speakers, particularly in memory, fluency, and reading tasks.
  • Multilingual individuals are more likely to be misclassified as having cognitive decline.
  • Bias is more pronounced when models are trained on the DementiaBank dataset.
  • Speakers with a South Yorkshire accent are more likely to be misclassified as having more severe cognitive decline.
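The misclassification bias described above is typically quantified by comparing error rates across speaker groups. As a minimal sketch (not the authors' actual evaluation code), the gap in false-positive rates, i.e. how often healthy speakers in each group are wrongly flagged as cognitively impaired, can be computed as follows; the group names and data below are purely illustrative:

```python
# Hypothetical sketch: group-level bias measured as the gap in
# false-positive rates (healthy speakers flagged as impaired).
# Labels: 0 = healthy, 1 = cognitive decline. Data is illustrative only.

def false_positive_rate(y_true, y_pred):
    """Fraction of healthy speakers (label 0) predicted as impaired (1)."""
    healthy = [(t, p) for t, p in zip(y_true, y_pred) if t == 0]
    if not healthy:
        return 0.0
    return sum(p for _, p in healthy) / len(healthy)

def fpr_gap_by_group(y_true, y_pred, groups):
    """Per-group FPRs and the max pairwise gap (a simple bias signal)."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        rates[g] = false_positive_rate([y_true[i] for i in idx],
                                       [y_pred[i] for i in idx])
    gap = max(rates.values()) - min(rates.values())
    return rates, gap

# Toy example: monolingual vs multilingual speakers
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]   # healthy multilinguals wrongly flagged
groups = ["mono", "mono", "mono", "multi", "multi", "multi", "mono", "multi"]
rates, gap = fpr_gap_by_group(y_true, y_pred, groups)
```

A large gap between groups, as in this toy example, is the kind of evidence the study reports: healthy multilingual speakers are flagged as impaired far more often than monolingual ones.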

Merits

Innovative Approach

The study is the first to examine AI bias in detecting cognitive decline among multilingual speakers in the UK, providing valuable insights into the limitations of current AI tools.

Comprehensive Data Collection

The research includes a diverse sample of participants, including monolingual and multilingual speakers from various linguistic backgrounds, enhancing the robustness of the findings.

Demerits

Limited Sample Size

The pilot study has a relatively small sample size, which may limit the generalizability of the findings to the broader population.

Dataset Limitations

The reliance on the DementiaBank dataset, which may not be representative of diverse populations, could contribute to the observed biases.

Expert Commentary

The study by [Authors] provides a critical examination of bias in AI models used to detect cognitive decline. The findings are timely given the increasing diversity of the UK population and the projected rise in dementia prevalence among ethnic minority communities. The biases identified against multilingual speakers and those with non-native accents underscore the need for more inclusive AI tools, and the reliance on datasets such as DementiaBank highlights the importance of training on more representative data. The study's call for bias-mitigated models is a significant step towards equitable healthcare outcomes. However, the limitations of the pilot study, notably its small sample size, should be addressed in future research to strengthen the validity of the findings. Overall, this study contributes valuable insights to the ongoing discourse on AI ethics and its applications in healthcare.

Recommendations

  • Future research should focus on developing and validating AI models on more diverse and representative datasets to mitigate biases against multilingual and ethnic minority populations.
  • Clinical practitioners should be cautious when interpreting AI-generated diagnostic results, particularly for multilingual speakers, and consider additional assessment methods to ensure accuracy.

Sources