
Auditing Reciprocal Sentiment Alignment: Inversion Risk, Dialect Representation and Intent Misalignment in Transformers


Nusrat Jahan Lia, Shubhashis Roy Dipta

arXiv:2602.17469v1 Announce Type: new

Abstract: The core theme of bidirectional alignment is ensuring that AI systems accurately understand human intent and that humans can trust AI behavior. However, this loop fractures significantly across language barriers. Our research addresses Cross-Lingual Sentiment Misalignment between Bengali and English by benchmarking four transformer architectures. We reveal severe safety and representational failures in current alignment paradigms. We demonstrate that the compressed model (mDistilBERT) exhibits a 28.7% "Sentiment Inversion Rate," fundamentally misinterpreting positive user intent as negative (or vice versa). Furthermore, we identify systemic nuances affecting human-AI trust, including "Asymmetric Empathy," where some models systematically dampen and others amplify the affective weight of Bengali text relative to its English counterpart. Finally, we reveal a "Modern Bias" in the regional model (IndicBERT), which shows a 57% increase in alignment error when processing formal (Sadhu) Bengali. We argue that equitable human-AI co-evolution requires pluralistic, culturally grounded alignment that respects language and dialectal diversity over universal compression, which fails to preserve the emotional fidelity required for reciprocal human-AI trust. We recommend that alignment benchmarks incorporate "Affective Stability" metrics that explicitly penalize polarity inversions in low-resource and dialectal contexts.

Executive Summary

This article examines Cross-Lingual Sentiment Misalignment between Bengali and English in transformer architectures. The research reveals significant safety and representational failures, including a 28.7% sentiment inversion rate in a compressed model (mDistilBERT) and asymmetric empathy across languages, both of which can erode human-AI trust. The study argues for a pluralistic and culturally grounded approach to alignment, emphasizing the need to respect language and dialectal diversity. The authors recommend incorporating affective stability metrics that penalize polarity inversions in low-resource and dialectal contexts.
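To make the "Sentiment Inversion Rate" concrete, here is a minimal, hypothetical sketch of how such a metric could be computed from paired predictions on parallel English/Bengali sentences. The function name, score convention (polarity in [-1, 1]), and example values are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch: a Sentiment Inversion Rate (SIR) over parallel
# English/Bengali model predictions. Scores lie in [-1, 1], with sign
# giving polarity (negative < 0 < positive).

def sentiment_inversion_rate(en_scores, bn_scores):
    """Fraction of parallel pairs whose polarity flips sign across languages."""
    if len(en_scores) != len(bn_scores) or not en_scores:
        raise ValueError("score lists must be parallel and non-empty")
    inversions = sum(
        1 for en, bn in zip(en_scores, bn_scores)
        if en * bn < 0  # opposite signs: positive intent read as negative, or vice versa
    )
    return inversions / len(en_scores)

# Example: one polarity flip among four parallel pairs.
rate = sentiment_inversion_rate([0.8, -0.6, 0.4, 0.9],
                                [0.7, -0.5, -0.3, 0.6])
print(rate)  # 0.25
```

A rate of 0.287, as reported for mDistilBERT, would mean more than one in four Bengali inputs had their polarity inverted relative to the English counterpart.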

Key Points

  • Cross-Lingual Sentiment Misalignment between Bengali and English languages
  • Severe safety and representational failures in current alignment paradigms
  • Need for pluralistic and culturally grounded alignment approaches

Merits

Comprehensive Benchmarking

The study benchmarks four transformer architectures, providing a thorough analysis of their performance and limitations.

Demerits

Limited Scope

The research focuses on a specific language pair (Bengali and English), which may limit its generalizability to other languages and contexts.

Expert Commentary

This study contributes significantly to our understanding of the complexities involved in cross-lingual sentiment analysis. The findings underscore the need for a more nuanced approach to alignment, one that acknowledges the complexities of language and dialectal diversity. By highlighting the limitations of current approaches, the authors provide a compelling case for the development of more culturally grounded and pluralistic AI systems. The recommendations for incorporating affective stability metrics are particularly noteworthy, as they offer a concrete step towards mitigating the risks associated with sentiment inversion and asymmetric empathy.

Recommendations

  • Develop and incorporate affective stability metrics in alignment benchmarks
  • Prioritize culturally grounded and pluralistic approaches to AI system design
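The recommended "Affective Stability" metric could be sketched as follows: a score that penalizes outright polarity inversions heavily while charging only a small cost for magnitude drift (the "asymmetric empathy" effect). The function, penalty weights, and score range here are illustrative assumptions for how a benchmark might operationalize the idea, not the authors' definition.

```python
# Hypothetical "Affective Stability" score in [0, 1]; higher = more stable.
# A polarity inversion costs a full penalty; same-sign magnitude drift
# (dampening or amplification) costs a fraction of the absolute difference.

def affective_stability(en_scores, bn_scores,
                        inversion_penalty=1.0, drift_penalty=0.25):
    """Aggregate stability of cross-lingual sentiment over parallel pairs."""
    if len(en_scores) != len(bn_scores) or not en_scores:
        raise ValueError("score lists must be parallel and non-empty")
    total = 0.0
    for en, bn in zip(en_scores, bn_scores):
        if en * bn < 0:
            total += inversion_penalty          # polarity flipped: heavy penalty
        else:
            total += drift_penalty * abs(en - bn)  # dampened/amplified affect
    return max(0.0, 1.0 - total / len(en_scores))

# Mild drift only: score stays high.
stable = affective_stability([0.8, -0.6], [0.7, -0.5])
# One inversion out of two pairs: score drops sharply.
unstable = affective_stability([0.8, -0.6], [-0.7, -0.5])
```

Separating the two penalty terms is the point: a benchmark built this way ranks a model that merely dampens Bengali affect well above one that inverts it, matching the paper's argument that polarity inversions are the more serious trust failure.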
