Avoiding Over-smoothing in Social Media Rumor Detection with Pre-trained Propagation Tree Transformer
arXiv:2603.22854v1 Announce Type: new Abstract: Deep learning techniques for rumor detection typically utilize Graph Neural Networks (GNNs) to analyze post relations. These methods, however, falter due to over-smoothing issues when processing rumor propagation structures, leading to declining performance. Our investigation into this issue reveals that over-smoothing is intrinsically tied to the structural characteristics of rumor propagation trees, in which the majority of nodes are 1-level nodes. Furthermore, GNNs struggle to capture long-range dependencies within these trees. To circumvent these challenges, we propose a Pre-Trained Propagation Tree Transformer (P2T3) method based on a pure Transformer architecture. It extracts all conversation chains from a tree structure following the propagation direction of replies, utilizes token-wise embedding to infuse connection information and introduce the necessary inductive bias, and pre-trains on large-scale unlabeled datasets. Experiments indicate that P2T3 surpasses previous state-of-the-art methods on multiple benchmark datasets and performs well under few-shot conditions. P2T3 not only avoids the over-smoothing issue inherent in GNNs but also potentially offers a large-model or unified multi-modal scheme for future social media research.
Executive Summary
The article addresses a critical limitation of current rumor detection models: over-smoothing caused by the structural characteristics of rumor propagation trees, in particular the dominance of 1-level nodes and the inability of GNNs to capture long-range dependencies. The authors propose P2T3, a Transformer-based framework that mitigates these issues by extracting conversation chains along the reply direction, embedding connection information token-wise, and leveraging pre-training on large unlabeled datasets. Experimental results demonstrate superior performance across multiple benchmarks and robustness in few-shot scenarios, effectively circumventing the over-smoothing problem inherent in GNNs. This work offers a novel architectural solution with potential broader implications for multi-modal social media analysis.
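The chain-extraction step described above can be sketched as follows. This is a hypothetical illustration only: the function name, tree representation, and toy data are invented here, and the paper's actual preprocessing may differ in detail.

```python
def extract_chains(tree, root):
    """Enumerate all root-to-leaf conversation chains in a propagation tree.

    tree: dict mapping a post ID to the list of post IDs that reply to it.
    root: ID of the source post.
    Returns a list of chains, each a list of post IDs in propagation
    (reply) order, as the abstract describes.
    """
    children = tree.get(root, [])
    if not children:                      # leaf post: the chain ends here
        return [[root]]
    chains = []
    for child in children:
        for sub in extract_chains(tree, child):
            chains.append([root] + sub)   # prepend the parent post
    return chains


# Toy tree: post 0 is the source; posts 1 and 2 reply to 0; post 3 replies to 1.
toy_tree = {0: [1, 2], 1: [3]}
print(extract_chains(toy_tree, 0))  # [[0, 1, 3], [0, 2]]
```

Each extracted chain is a linear sequence, which is what lets a pure Transformer consume the tree without graph convolutions; how the chains are tokenized and embedded is then handled by the token-wise embedding the abstract mentions.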
Key Points
- ▸ Over-smoothing is rooted in structural properties of rumor propagation trees, particularly 1-level node dominance.
- ▸ GNNs struggle with long-range dependency capture in these structures.
- ▸ P2T3 introduces a Transformer-based architecture with pre-training and directional propagation chain extraction to overcome these limitations.
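The over-smoothing point above can be made concrete with a toy experiment (not from the paper): repeated neighbor-mean aggregation, the basic operation of many GNN layers, applied to a star-shaped tree in which every reply attaches directly to the source post, i.e. the 1-level-node structure the abstract highlights. The node features collapse toward a single common value within a few rounds, leaving the network unable to distinguish posts.

```python
def mean_aggregate(features, edges, rounds):
    """Average each node's feature with its neighbors', `rounds` times.

    This mimics a simplified GNN layer (mean aggregation with a self-loop)
    without any learned weights, purely to expose the smoothing dynamics.
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:                 # undirected message passing
        neighbors[u].append(v)
        neighbors[v].append(u)
    for _ in range(rounds):
        features = [
            (features[i] + sum(features[j] for j in neighbors[i]))
            / (1 + len(neighbors[i]))
            for i in range(n)
        ]
    return features


# Star tree: node 0 is the source post, nodes 1..4 are direct (1-level) replies.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
feats = [1.0, 0.0, 0.0, 0.0, 0.0]
print(mean_aggregate(feats, star_edges, 10))
```

On this star graph the gap between the source and reply features shrinks by a constant factor every round, so after ten rounds all five features are nearly identical; deeper stacking only makes the collapse worse, which is the failure mode P2T3's Transformer formulation sidesteps.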
Merits
Architectural Innovation
P2T3’s design directly targets the root cause of over-smoothing by leveraging Transformer mechanics and directional propagation context, offering a more scalable and accurate alternative to GNNs.
Demerits
Limited Scope
While effective for propagation trees, the study does not extend validation to multimodal or heterogeneous graph scenarios, potentially limiting applicability beyond tree-structured rumor networks.
Expert Commentary
This paper makes a significant theoretical contribution by reframing the over-smoothing problem not as a computational artifact but as a structural misalignment between GNNs and rumor propagation topologies. The authors astutely identify that the prevalence of 1-level nodes in tree-like propagation networks fundamentally undermines GNN capacity to distinguish hierarchical nuance, a point often overlooked in prior literature. Their solution—pre-training on unlabeled data combined with directional propagation chain extraction—is elegant in its simplicity and effectiveness. Moreover, the choice of Transformer architecture aligns with contemporary trends in NLP, suggesting a convergence between social media analytics and mainstream AI models. Notably, the potential for extension into multi-modal domains opens new avenues for cross-modal rumor detection, such as integrating textual, visual, and metadata signals. While the absence of empirical validation on heterogeneous graphs is a minor limitation, the core insight and experimental validation are compelling. This work sets a new benchmark for rumor detection and warrants replication in cross-platform datasets to assess generalizability.
Recommendations
- ✓ 1. Platform operators should consider integrating P2T3 into their rumor detection pipelines as a primary or complementary module.
- ✓ 2. Future research should extend P2T3’s architecture to multimodal graph structures, particularly those combining text, image, and user interaction data, to assess scalability and broader applicability.
Sources
Original: arXiv - cs.CL