Rethinking Metrics for Lexical Semantic Change Detection
arXiv:2602.15716v1 Announce Type: new Abstract: Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT). We introduce Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), new measures that quantify semantic change via local correspondence between word usages across time periods. Across multiple languages, encoder models, and representation spaces, we show that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. We suggest that LSCD may benefit from considering alternative semantic change metrics beyond APD and PRT, with AMD offering a robust option for contextualised embedding-based analysis.
Executive Summary
The article 'Rethinking Metrics for Lexical Semantic Change Detection' challenges the conventional reliance on Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT) in lexical semantic change detection (LSCD). The authors introduce two new metrics, Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), which quantify semantic change by focusing on local correspondence between word usages across different time periods. The study demonstrates that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. The findings suggest that LSCD methodologies could benefit from incorporating alternative semantic change metrics beyond the traditional APD and PRT.
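The contrast between the metrics can be made concrete. The sketch below assumes plausible formulations based on the names and the abstract's description ("local correspondence between word usages"): APD as the mean of all cross-period pairwise cosine distances, PRT as the cosine distance between period centroids, AMD as the mean nearest-neighbour distance from one period's usages to the other's, and SAMD as its symmetrised average. The exact definitions in the paper may differ; this is an illustrative approximation, not the authors' implementation.

```python
import numpy as np

def cosine_dist(a, b):
    """Pairwise cosine distances between rows of a (n, d) and b (m, d)."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a_n @ b_n.T

def apd(X, Y):
    # Average Pairwise Distance: mean over all cross-period usage pairs.
    return cosine_dist(X, Y).mean()

def prt(X, Y):
    # Prototype distance: cosine distance between the two period centroids.
    return cosine_dist(X.mean(0, keepdims=True), Y.mean(0, keepdims=True))[0, 0]

def amd(X, Y):
    # Average Minimum Distance (assumed form): for each usage embedding in X,
    # take the distance to its nearest neighbour in Y, then average.
    return cosine_dist(X, Y).min(axis=1).mean()

def samd(X, Y):
    # Symmetric AMD (assumed form): average AMD in both directions.
    return 0.5 * (amd(X, Y) + amd(Y, X))
```

Note that AMD is directional (amd(X, Y) and amd(Y, X) generally differ), which is presumably what SAMD's symmetrisation addresses; and because each row's minimum is at most its mean, AMD never exceeds APD on the same embeddings.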
Key Points
- ▸ Introduction of new metrics (AMD and SAMD) for LSCD.
- ▸ AMD shows robust performance under various conditions.
- ▸ SAMD excels with specialised encoders.
- ▸ Recommendation to consider alternative metrics for LSCD.
Merits
Innovative Metrics
The introduction of AMD and SAMD provides new tools for measuring semantic change, potentially improving the accuracy and robustness of LSCD.
Comprehensive Evaluation
The study evaluates the new metrics across multiple languages, encoder models, and representation spaces, providing a thorough assessment of their performance.
Practical Implications
The findings offer practical insights for researchers and practitioners in the field of natural language processing and computational linguistics.
Demerits
Limited Scope
The study focuses primarily on contextualised language model embeddings, which may limit the generalisability of the findings to other types of semantic change detection methods.
Complexity of Metrics
The introduction of new metrics adds complexity to the LSCD process, which may require additional computational resources and expertise to implement effectively.
Potential Bias
The performance of AMD and SAMD may be influenced by the choice of encoder models and representation spaces, which could introduce bias into the results.
Expert Commentary
The article 'Rethinking Metrics for Lexical Semantic Change Detection' presents a meaningful advance in lexical semantic change detection by introducing two novel metrics, AMD and SAMD. The evaluation across multiple languages, encoder models, and representation spaces lends credibility to the claimed robustness: AMD holds up under dimensionality reduction and with non-specialised encoders, while SAMD performs best with specialised encoders. This differentiation underscores that metric choice should follow from the specific context of the analysis, such as the encoder and representation space in use, rather than defaulting to convention. The findings offer practical guidance for researchers and practitioners in natural language processing and computational linguistics, though the added complexity of the new metrics and the potential for encoder-dependent bias in their performance warrant further investigation. Overall, the article makes a compelling case for moving beyond the traditional reliance on APD and PRT toward a more context-aware approach to lexical semantic change detection.
Recommendations
- ✓ Further research should explore the applicability of AMD and SAMD to other types of semantic change detection methods beyond contextualised language model embeddings.
- ✓ Future studies should investigate the potential biases introduced by different encoder models and representation spaces and develop strategies to mitigate these biases.