Multiclass Hate Speech Detection with RoBERTa-OTA: Integrating Transformer Attention and Graph Convolutional Networks
arXiv:2603.04414v1 Announce Type: new Abstract: Multiclass hate speech detection across demographic categories remains computationally challenging due to implicit targeting strategies and linguistic variability in social media content. Existing approaches rely solely on learned representations from training data, without explicitly incorporating structured ontological frameworks that can enhance classification through formal domain knowledge integration. We propose RoBERTa-OTA, which introduces ontology-guided attention mechanisms that process textual features alongside structured knowledge representations through enhanced Graph Convolutional Networks. The architecture combines RoBERTa embeddings with scaled attention layers and graph neural networks to integrate contextual language understanding with domain-specific semantic knowledge. Evaluation across 39,747 balanced samples using 5-fold cross-validation demonstrates significant performance gains over baseline RoBERTa implementations and existing state-of-the-art methods. RoBERTa-OTA achieves 96.04% accuracy compared to 95.02% for standard RoBERTa, with substantial improvements for challenging categories: gender-based hate speech detection improves by 2.36 percentage points while other hate speech categories improve by 2.38 percentage points. The enhanced architecture maintains computational efficiency with only 0.33% parameter overhead, providing practical advantages for large-scale content moderation applications requiring fine-grained demographic hate speech classification.
Executive Summary
This paper proposes RoBERTa-OTA, a novel architecture for multiclass hate speech detection that integrates transformer attention with graph convolutional networks. By combining RoBERTa embeddings with scaled attention layers and graph neural networks, RoBERTa-OTA enriches contextual language understanding with domain-specific semantic knowledge drawn from a structured ontology. The architecture demonstrates significant performance gains over baseline RoBERTa implementations and existing state-of-the-art methods, achieving 96.04% accuracy with substantial improvements on challenging categories such as gender-based hate speech. It maintains computational efficiency with only 0.33% parameter overhead, offering practical advantages for large-scale content moderation. The approach has the potential to improve both the accuracy and efficiency of hate speech detection in social media content, which is crucial for maintaining online safety and preventing cyberbullying.
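The paper does not publish the exact layer definitions, but the described fusion of token embeddings, scaled attention, and a graph convolution over ontology concepts can be sketched in a minimal, hypothetical form. Below, random matrices stand in for RoBERTa token embeddings and ontology node features, `gcn_layer` is a standard symmetric-normalized GCN layer, and `scaled_attention` is ordinary scaled dot-product attention; all names, shapes, and the fusion order are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_attention(Q, K, V):
    # scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ V

def gcn_layer(A, H, W):
    # one GCN layer with self-loops and symmetric normalization:
    # ReLU(D^{-1/2} (A + I) D^{-1/2} H W)
    A_hat = A + np.eye(A.shape[0])
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
T, d = 8, 16        # token count and toy hidden size (RoBERTa-base uses 768)
n_concepts = 5      # hypothetical ontology nodes, e.g. demographic hate categories

tokens = rng.normal(size=(T, d))            # stand-in for RoBERTa token embeddings
concepts = rng.normal(size=(n_concepts, d)) # stand-in for ontology node features
A = (rng.random((n_concepts, n_concepts)) > 0.6).astype(float)
A = np.maximum(A, A.T)                      # symmetric ontology adjacency

W = rng.normal(size=(d, d)) * 0.1
graph_feats = gcn_layer(A, concepts, W)                     # (n_concepts, d)
fused = scaled_attention(tokens, graph_feats, graph_feats)  # tokens attend to ontology
print(fused.shape)  # (8, 16): one knowledge-enriched vector per token
```

In this sketch each token representation is re-expressed as a mixture of GCN-refined ontology concept vectors; a real system would add trainable projections for Q/K/V, residual connections, and a classification head on top of the fused features.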
Key Points
- ▸ Multiclass hate speech detection is computationally challenging due to implicit targeting strategies and linguistic variability in social media content.
- ▸ RoBERTa-OTA integrates transformer attention and graph convolutional networks to enhance contextual language understanding with domain-specific semantic knowledge.
- ▸ The proposed architecture demonstrates significant performance gains over baseline RoBERTa implementations and existing state-of-the-art methods.
Merits
Strength in Domain Knowledge Integration
RoBERTa-OTA explicitly incorporates structured ontological frameworks that can enhance classification through formal domain knowledge integration, addressing a significant limitation of existing approaches.
Improved Accuracy and Efficiency
The enhanced architecture achieves 96.04% accuracy with substantial improvements for challenging categories while maintaining computational efficiency with only 0.33% parameter overhead.
Demerits
Limited Generalizability
The proposed approach may not generalize well to other domains or languages due to its reliance on structured ontological frameworks and domain-specific semantic knowledge.
Expert Commentary
The proposed approach in RoBERTa-OTA demonstrates significant improvements in multiclass hate speech detection, addressing a critical challenge in the field of natural language processing. The integration of transformer attention and graph convolutional networks enables the enhanced architecture to capture contextual language understanding and domain-specific semantic knowledge, achieving 96.04% accuracy with substantial improvements for challenging categories. However, the approach relies on structured ontological frameworks and domain-specific semantic knowledge, which may limit its generalizability to other domains or languages. Nevertheless, the proposed approach has the potential to improve the accuracy and efficiency of hate speech detection in social media content, making it a valuable contribution to the field.
Recommendations
- ✓ Future research should focus on adapting the proposed approach to other domains and languages, addressing the limitations of generalizability.
- ✓ The integration of RoBERTa-OTA with other natural language processing techniques, such as sentiment analysis and fake news detection, could further enhance its capabilities and applications.