AIWizards at MULTIPRIDE: A Hierarchical Approach to Slur Reclamation Detection
arXiv:2602.12818v1 Abstract: Detecting reclaimed slurs represents a fundamental challenge for hate speech detection systems, as the same lexical items can function either as abusive expressions or as in-group affirmations depending on social identity and context. In this work, we address Subtask B of the MultiPRIDE shared task at EVALITA 2026 by proposing a hierarchical approach to modeling the slur reclamation process. Our core assumption is that members of the LGBTQ+ community are more likely, on average, to employ certain slurs in a reclamatory manner. Based on this hypothesis, we decompose the task into two stages. First, using a weakly supervised LLM-based annotation, we assign fuzzy labels to users indicating the likelihood of belonging to the LGBTQ+ community, inferred from the tweet and the user bio. These soft labels are then used to train a BERT-like model to predict community membership, encouraging the model to learn latent representations associated with LGBTQ+ identity. In the second stage, we integrate this latent space with a newly initialized model for the downstream slur reclamation detection task. The intuition is that the first model encodes user-oriented sociolinguistic signals, which are then fused with representations learned by a model pretrained for hate speech detection. Experimental results on Italian and Spanish show that our approach achieves performance statistically comparable to a strong BERT-based baseline, while providing a modular and extensible framework for incorporating sociolinguistic context into hate speech modeling. We argue that more fine-grained hierarchical modeling of user identity and discourse context may further improve the detection of reclaimed language. We release our code at https://github.com/LucaTedeschini/multipride.
Executive Summary
The article 'AIWizards at MULTIPRIDE: A Hierarchical Approach to Slur Reclamation Detection' presents a novel hierarchical framework for detecting reclaimed slurs in hate speech detection systems. The authors address the challenge of distinguishing between abusive expressions and in-group affirmations by leveraging sociolinguistic context. Their approach involves a two-stage process: first, predicting LGBTQ+ community membership using weakly supervised LLM-based annotation, and second, integrating this information with a hate speech detection model. The study demonstrates statistically comparable performance to a strong BERT-based baseline on Italian and Spanish datasets, advocating for more fine-grained modeling of user identity and discourse context.
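The first stage, training a classifier against fuzzy (soft) membership labels, amounts to minimizing a cross-entropy against probability targets rather than hard one-hot labels. The following is a minimal sketch of that objective in plain NumPy; the function names and two-class setup are assumptions for illustration, not the authors' released code:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_loss(logits, soft_targets):
    """Cross-entropy against fuzzy LLM-derived membership labels.

    logits:       (batch, 2) raw classifier scores
    soft_targets: (batch, 2) probabilities of (non-member, member);
                  each row sums to 1.
    """
    log_probs = np.log(softmax(logits) + 1e-12)
    return -np.mean(np.sum(soft_targets * log_probs, axis=-1))
```

The same quantity is what a BERT-like model would minimize when its classification head is trained on the LLM-assigned soft labels: predictions that agree with the fuzzy target distribution yield a low loss, while confident disagreement is penalized heavily.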
Key Points
- ▸ Hierarchical approach to slur reclamation detection
- ▸ Two-stage process involving community membership prediction and hate speech detection
- ▸ Use of weakly supervised LLM-based annotation for user labeling
- ▸ Integration of sociolinguistic context into hate speech modeling
- ▸ Performance comparable to BERT-based baseline on Italian and Spanish datasets
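The second stage fuses the identity model's latent representation with the hate-speech model's representation before classification. A minimal NumPy sketch of one plausible fusion, concatenation followed by a linear head, is shown below; concatenation and all names here are illustrative assumptions, since the abstract does not specify the exact fusion mechanism:

```python
import numpy as np

def fuse_and_classify(h_identity, h_hate, W, b):
    """Fuse two sentence embeddings and classify.

    h_identity: (batch, d1) embedding from the community-membership model
    h_hate:     (batch, d2) embedding from the hate-speech model
    W, b:       linear head mapping (d1 + d2) -> num_classes

    Concatenation is an assumed fusion choice for illustration.
    """
    fused = np.concatenate([h_identity, h_hate], axis=-1)  # (batch, d1 + d2)
    return fused @ W + b                                   # logits
```

In practice both encoders would be transformer models and the head would be trained end-to-end on the reclamation labels; the point of the sketch is only that the sociolinguistic signal enters the downstream task as an extra feature subspace.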
Merits
Innovative Framework
The hierarchical approach offers a modular and extensible framework for incorporating sociolinguistic context, which is crucial for accurate hate speech detection.
Weakly Supervised Annotation
The use of weakly supervised LLM-based annotation provides a scalable and efficient method for labeling user data, reducing the need for extensive manual annotation.
Comparable Performance
The approach achieves performance statistically comparable to a strong BERT-based baseline, showing that the added sociolinguistic stage does not degrade accuracy while making the framework more modular and extensible.
Demerits
Limited Language Coverage
The study is limited to Italian and Spanish datasets, which may not generalize to other languages or cultural contexts.
Potential Bias in Annotation
The weakly supervised LLM-based annotation may introduce biases, particularly if the training data is not representative of the diverse LGBTQ+ community.
Complexity of Implementation
The hierarchical approach adds complexity to the model, which may require significant computational resources and expertise for implementation.
Expert Commentary
The article presents a significant advancement in the field of hate speech detection by addressing the complex issue of slur reclamation. The hierarchical approach, which integrates sociolinguistic context, offers a promising framework for improving the accuracy and fairness of detection systems. The use of weakly supervised LLM-based annotation is particularly noteworthy, as it provides a scalable and efficient method for labeling user data. However, the study's limitations, such as the potential for bias in annotation and the complexity of implementation, should be carefully considered. Future research should focus on expanding the approach to other languages and cultural contexts, as well as addressing the ethical implications of AI-based hate speech detection. Overall, the study contributes valuable insights to the ongoing efforts to develop more nuanced and context-aware hate speech detection models.
Recommendations
- ✓ Expand the study to include a broader range of languages and cultural contexts to enhance the generalizability of the findings.
- ✓ Conduct further research on the ethical implications of AI-based hate speech detection, particularly in relation to marginalized communities.
- ✓ Develop more robust and unbiased annotation methods to ensure the accuracy and fairness of the detection systems.