Click it or Leave it: Detecting and Spoiling Clickbait with Informativeness Measures and Large Language Models
arXiv:2602.18171v1 Announce Type: new Abstract: Clickbait headlines degrade the quality of online information and undermine user trust. We present a hybrid approach to clickbait detection that combines transformer-based text embeddings with linguistically motivated informativeness features. Using natural language processing techniques, we evaluate classical vectorizers, word embedding baselines, and large language model embeddings paired with tree-based classifiers. Our best-performing model, XGBoost over embeddings augmented with 15 explicit features, achieves an F1-score of 91%, outperforming TF-IDF, Word2Vec, GloVe, LLM prompt-based classification, and feature-only baselines. The proposed feature set enhances interpretability by highlighting salient linguistic cues such as second-person pronouns, superlatives, numerals, and attention-oriented punctuation, enabling transparent and well-calibrated clickbait predictions. We release code and trained models to support reproducible research.
Executive Summary
This article proposes a hybrid approach to detecting clickbait headlines by combining transformer-based text embeddings with linguistically motivated informativeness features. The model achieves an F1-score of 91%, outperforming traditional baselines. The proposed feature set enhances interpretability by highlighting salient linguistic cues, enabling transparent and well-calibrated clickbait predictions. The study contributes to the development of more effective clickbait detection methods, improving online information quality and user trust.
Key Points
- ▸ Hybrid approach combining transformer-based text embeddings and linguistically motivated informativeness features
- ▸ Achieves an F1-score of 91%, outperforming traditional baselines
- ▸ Proposed feature set enhances interpretability by highlighting salient linguistic cues
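The cues the abstract names (second-person pronouns, superlatives, numerals, attention-oriented punctuation) can be counted with simple surface heuristics. A minimal sketch of four such counts follows; the function name and the heuristics are illustrative assumptions, not the paper's implementation, which uses 15 explicit features:

```python
import re

def clickbait_cues(headline: str) -> dict:
    """Count a small illustrative subset of the linguistic cues the
    paper highlights (the full model uses 15 explicit features)."""
    tokens = re.findall(r"[A-Za-z']+", headline.lower())
    second_person = {"you", "your", "yours", "yourself"}
    # Crude superlative heuristic: common irregulars plus an '-est' suffix.
    superlative_lexicon = {"best", "worst", "most", "least"}
    return {
        "second_person": sum(t in second_person for t in tokens),
        "superlative": sum(
            t in superlative_lexicon or (t.endswith("est") and len(t) > 4)
            for t in tokens
        ),
        "numeral": len(re.findall(r"\d+", headline)),
        "attention_punct": headline.count("!") + headline.count("?"),
    }

print(clickbait_cues("10 Things You Won't Believe About Your Phone!"))
# → {'second_person': 2, 'superlative': 0, 'numeral': 1, 'attention_punct': 1}
```

Counts like these are cheap to compute and directly inspectable, which is what makes the downstream predictions easier to explain than embedding dimensions alone.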
Merits
High Accuracy
The proposed model achieves a high F1-score of 91%, indicating its effectiveness in detecting clickbait headlines
Interpretability
The proposed feature set provides insights into the linguistic cues that contribute to clickbait detection, enhancing model transparency
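The interpretability gain comes from keeping the explicit features as named slots alongside the anonymous embedding dimensions: a tree model's importance scores can then be read off per linguistic cue. A minimal sketch of that vector assembly, with hypothetical names and a toy embedding (no real embedding model or XGBoost call):

```python
def hybrid_vector(embedding, cue_features):
    """Concatenate a dense embedding with named explicit features.
    Named slots let a downstream tree model's feature importances be
    attributed to linguistic cues rather than opaque dimensions."""
    names = [f"emb_{i}" for i in range(len(embedding))] + list(cue_features)
    values = list(embedding) + [cue_features[k] for k in cue_features]
    return names, values

names, values = hybrid_vector([0.12, -0.30, 0.05],
                              {"second_person": 2, "numeral": 1})
print(names)   # → ['emb_0', 'emb_1', 'emb_2', 'second_person', 'numeral']
print(values)  # → [0.12, -0.3, 0.05, 2, 1]
```

In the paper's setup this combined vector would be fed to XGBoost; the concatenation itself is the only part sketched here.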
Demerits
Limited Generalizability
The study's results may not generalize to other domains or datasets, highlighting the need for further testing and validation
Dependence on Large Language Models
The proposed approach relies on large language model embeddings, which are computationally expensive to produce and require significant resources
Expert Commentary
The proposed hybrid approach to clickbait detection represents a significant advancement in the field, leveraging the strengths of both transformer-based text embeddings and linguistically motivated informativeness features. The study's emphasis on interpretability is particularly noteworthy, as it enables a deeper understanding of the linguistic cues that contribute to clickbait detection. However, further research is needed to address the limitations of the approach, including its dependence on large language models and potential lack of generalizability to other domains.
Recommendations
- ✓ Further testing and validation of the proposed approach on diverse datasets and domains
- ✓ Exploration of alternative approaches that can reduce the computational expenses associated with large language models