From Perceptions to Evidence: Detecting AI-Generated Content in Turkish News Media with a Fine-Tuned BERT Classifier

arXiv:2602.13504v1. Abstract: The rapid integration of large language models into newsroom workflows has raised urgent questions about the prevalence of AI-generated content in online media. While computational studies have begun to quantify this phenomenon in English-language outlets, no empirical investigation exists for Turkish news media, where existing research remains limited to qualitative interviews with journalists or fake news detection. This study addresses that gap by fine-tuning a Turkish-specific BERT model (dbmdz/bert-base-turkish-cased) on a labeled dataset of 3,600 articles from three major Turkish outlets with distinct editorial orientations for binary classification of AI-rewritten content. The model achieves 0.9708 F1 score on the held-out test set with symmetric precision and recall across both classes. Subsequent deployment on over 3,500 unseen articles spanning between 2023 and 2026 reveals consistent cross-source and temporally stable classifi

Ozancan Ozdemir · March 7, 2026

#cs.CL #cs.AI

Executive Summary

This study addresses the gap in empirical research on AI-generated content in Turkish news media by fine-tuning a Turkish-specific BERT model (dbmdz/bert-base-turkish-cased) to detect AI-rewritten articles. The model achieves an F1 score of 0.9708 on the held-out test set, and its deployment on unseen articles suggests that approximately 2.5% of examined news content is AI-generated, the first data-driven measurement of AI usage in Turkish news media.

Key Points

▸ First empirical study on AI-generated content in Turkish news media.
▸ Fine-tuned Turkish BERT model reaches an F1 score of 0.9708 in detecting AI-rewritten content.
▸ Approximately 2.5% of examined news content is estimated to be AI-generated.
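To give a sense of how much uncertainty sits behind a figure like 2.5%, the sketch below computes a Wilson score interval for a flagged share. The counts (88 flagged out of 3,500) are hypothetical and chosen only to match the approximate figures quoted here; the source does not report exact counts or say whether the authors computed an interval.

```python
import math


def wilson_interval(flagged: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = flagged / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, NOT taken from the paper: about 2.5% of 3,500 articles.
low, high = wilson_interval(flagged=88, total=3500)
print(f"flagged share: {88 / 3500:.4f}, interval: [{low:.4f}, {high:.4f}]")
```

The interval only reflects sampling variation in the flagged share. It does not account for classifier errors, which would need to be handled separately.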

Merits

Innovative Methodology

The study fine-tunes a BERT model pretrained on Turkish text (dbmdz/bert-base-turkish-cased) rather than a multilingual or English model, which suits the classifier to the language of the articles it examines.
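A minimal sketch of how such a classifier could be set up with the Hugging Face transformers library follows. The checkpoint name comes from the abstract; the label order, the 512-token limit, and the helper names are assumptions for illustration, and the authors' actual training setup is not described here. Note that loading the base checkpoint attaches an untrained classification head, so predictions are only meaningful after fine-tuning on labeled articles.

```python
import math

MODEL_NAME = "dbmdz/bert-base-turkish-cased"  # checkpoint named in the abstract
LABELS = ("human-written", "ai-rewritten")    # assumed label order, not from the paper


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits: list[float]) -> tuple[str, float]:
    """Map two-class logits to the most probable label and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]


def load_classifier(checkpoint: str = MODEL_NAME):
    """Load tokenizer and a 2-label sequence classifier (head untrained for the base checkpoint)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    return tokenizer, model


def classify(text: str, tokenizer, model) -> tuple[str, float]:
    """Classify one article; long texts are truncated to 512 tokens (an assumption)."""
    import torch

    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

In use, `load_classifier` would point at a directory of fine-tuned weights instead of the base checkpoint, and `classify` would then be called once per article.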

High Accuracy

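The model achieves an F1 score of 0.9708 on the held-out test set, with symmetric precision and recall across both classes. This means it neither over-flags human-written articles nor systematically misses AI-rewritten ones, which is crucial for reliable detection.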
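As a reminder of how precision, recall, and F1 relate, the following sketch computes all three from a confusion matrix. The counts are hypothetical, picked to show a symmetric case on a balanced test split; they are not the paper's confusion matrix.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # AI-rewritten articles correctly flagged
    fp: int  # human-written articles wrongly flagged
    fn: int  # AI-rewritten articles missed
    tn: int  # human-written articles correctly passed


def precision(c: Confusion) -> float:
    """Share of flagged articles that really are AI-rewritten."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: Confusion) -> float:
    """Share of AI-rewritten articles that were flagged."""
    actual = c.tp + c.fn
    return c.tp / actual if actual else 0.0


def f1(c: Confusion) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for a balanced 360-article split, NOT from the paper.
example = Confusion(tp=175, fp=5, fn=5, tn=175)
print(f"precision={precision(example):.4f} recall={recall(example):.4f} f1={f1(example):.4f}")
```

When false positives and false negatives are equal, as in this example, precision and recall coincide and F1 equals both.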

Comprehensive Dataset

The study uses a labeled dataset of 3,600 articles from three major Turkish outlets with distinct editorial orientations, so the classifier is not tuned to the style of a single newsroom.

Demerits

Limited Scope

The study focuses on only three Turkish news outlets, which may not fully capture the diversity of AI usage across all Turkish media.

Temporal Limitations

The unseen articles used for deployment span 2023 to 2026, so the estimate is a snapshot of that window and may not hold as language models and newsroom practices change.

Binary Classification

The model labels each article as either AI-rewritten or not, which may oversimplify cases where AI contributes to only part of an article or assists a human writer.

Expert Commentary

This study offers the first data-driven measurement of AI-generated content in Turkish news media, an area previously limited to qualitative interviews with journalists and fake news detection. With an F1 score of 0.9708 on the held-out test set, the fine-tuned BERT model gives a credible basis for detecting AI-rewritten articles. The findings suggest that AI-generated content is present but remains a minor share, approximately 2.5%, of the examined articles. The study's limitations, namely the small number of outlets and the binary classification approach, mark clear directions for future research. For media outlets and policymakers, the approach offers a practical starting point for monitoring transparency and ethical use of AI in journalism. As AI continues to integrate into newsroom workflows, such empirical studies will be important in guiding both industry practices and regulatory frameworks.

Recommendations

✓ Expand the study to include a broader range of Turkish news outlets to capture the full spectrum of AI usage in the media landscape.
✓ Develop more nuanced classification models that can differentiate between various degrees of AI involvement in content creation, rather than relying on binary classification.

Sources

arXiv - cs.CL

From Perceptions to Evidence: Detecting AI-Generated Content in Turkish News Media with a Fine-Tuned BERT Classifier
