When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English
arXiv:2602.16957v1 Abstract: Euphemisms substitute for socially sensitive expressions, often softening or reframing meaning, and their reliance on cultural and pragmatic context complicates modeling across languages. In this study, we investigate how cross-lingual equivalence influences transfer in multilingual euphemism detection. We categorize Potentially Euphemistic Terms (PETs) in Turkish and English into Overlapping (OPETs) and Non-Overlapping (NOPETs) subsets based on their functional, pragmatic, and semantic alignment. Our findings reveal a transfer asymmetry: semantic overlap is insufficient to guarantee positive transfer, particularly in the low-resource Turkish-to-English direction, where performance can degrade even for overlapping euphemisms and, in some cases, improve under NOPET-based training. Differences in label distribution help explain these counterintuitive results. Category-level analysis suggests that transfer may be influenced by domain-specific alignment, though evidence is limited by sparsity.
Executive Summary
This study examines cross-lingual transfer in euphemism detection between Turkish and English. It splits Potentially Euphemistic Terms (PETs) into Overlapping (OPETs) and Non-Overlapping (NOPETs) subsets according to their functional, pragmatic, and semantic alignment across the two languages. The findings reveal a transfer asymmetry: semantic overlap does not guarantee positive transfer, particularly in the low-resource Turkish-to-English direction, where performance can degrade even for overlapping euphemisms and sometimes improves under NOPET-based training. Differences in label distribution help explain these results, and a category-level analysis suggests that domain-specific alignment may also play a role, although that evidence is limited by sparsity.
Key Points
- ▸ Semantic overlap alone does not guarantee positive cross-lingual transfer, because euphemisms also depend on cultural and pragmatic context
- ▸ Transfer is asymmetric: in the low-resource Turkish-to-English direction, performance can degrade even for overlapping euphemisms and, in some cases, improve under NOPET-based training
- ▸ Differences in label distribution help explain the counterintuitive results, while domain-specific alignment may also matter, though the category-level evidence is sparse
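The point about label distribution can be made concrete with a small sketch. The snippet below compares the label mix of two training subsets against a target test set using total variation distance. The labels are made-up toy values, and the function names `label_distribution` and `total_variation` are illustrative choices, not the paper's data or method.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each label in a non-empty list."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    0.0 means identical label mixes; 1.0 means completely disjoint ones.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Toy, invented labels (1 = euphemistic use, 0 = literal use).
# These are NOT figures from the paper; they only illustrate the comparison.
opet_train = [1, 1, 1, 0, 1, 1, 0, 1]
nopet_train = [1, 0, 0, 1, 0, 1, 0, 0]
target_test = [1, 0, 1, 0, 0, 1, 0, 0]

target_dist = label_distribution(target_test)
for name, labels in [("OPET", opet_train), ("NOPET", nopet_train)]:
    dist = label_distribution(labels)
    distance = total_variation(dist, target_dist)
    print(f"{name}: distribution={dist}, TV distance to target={distance:.3f}")
```

In this toy setup the NOPET subset happens to match the target label mix more closely than the OPET subset. That is one way a semantically less aligned training set could still transfer better, which is the kind of effect the abstract attributes to label distribution differences.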
Merits
Novel Approach
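The study introduces a framework that splits PETs into OPETs and NOPETs according to their functional, pragmatic, and semantic alignment across the two languages, so that transfer can be analyzed separately for euphemisms with and without a cross-lingual equivalent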
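As a rough illustration of how such a split might be applied to a dataset, the sketch below partitions annotated examples by whether their PET appears in a set of overlapping terms. The `Example` class, the `split_by_overlap` helper, the sample sentences, and the membership of the overlap set are all hypothetical and are not taken from the paper, where the alignment is judged on functional, pragmatic, and semantic grounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One annotated sentence (hypothetical schema for illustration)."""
    text: str
    pet: str    # the potentially euphemistic term found in the sentence
    label: int  # 1 = used euphemistically, 0 = used literally


def split_by_overlap(examples, overlapping_pets):
    """Partition examples into (OPET, NOPET) lists by PET membership."""
    opets = [ex for ex in examples if ex.pet in overlapping_pets]
    nopets = [ex for ex in examples if ex.pet not in overlapping_pets]
    return opets, nopets


# Invented sample data; which terms count as "overlapping" is assumed here,
# not a claim about the paper's actual term lists.
examples = [
    Example("Her grandfather passed away last spring.", "passed away", 1),
    Example("The ball passed away from the goal.", "passed away", 0),
    Example("Half the team was let go in March.", "let go", 1),
    Example("He is between jobs at the moment.", "between jobs", 1),
]
overlapping_pets = {"passed away"}

opet_examples, nopet_examples = split_by_overlap(examples, overlapping_pets)
print("OPET examples:", len(opet_examples))
print("NOPET examples:", len(nopet_examples))
```

Keeping the split as a plain set lookup makes it easy to train on one subset and evaluate on the other, which is the comparison the framework is meant to support.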
Demerits
Data Sparsity
The category-level analysis is limited by data sparsity, so the evidence for domain-specific effects is tentative and the findings may not generalize beyond the Turkish-English setting
Expert Commentary
The findings show that modeling euphemisms across languages depends on more than shared meaning: cultural and pragmatic context, and the label distribution of the training data, can outweigh semantic overlap. The OPET/NOPET split is a useful tool for isolating these effects in cross-lingual transfer experiments. Its main limitation is data sparsity, which leaves the category-level conclusions tentative and should be addressed in future research. Within those limits, the study clarifies when cross-lingual euphemism detection can be expected to benefit from transfer and when it cannot, which is relevant to multilingual natural language processing more broadly.
Recommendations
- ✓ Future studies should prioritize the collection and annotation of larger, more diverse datasets to mitigate the effects of data sparsity
- ✓ Researchers should apply the OPET/NOPET framework to other language pairs and domains to test whether the observed transfer asymmetry generalizes