Beyond Accuracy: An Explainability-Driven Analysis of Harmful Content Detection
arXiv:2603.18015v1 Abstract: Although automated harmful content detection systems are widely used to monitor online platforms, moderators and end users often cannot understand …
Trishita Dhara, Siddhesh Sheth