Avoiding Over-smoothing in Social Media Rumor Detection with Pre-trained Propagation Tree Transformer
arXiv:2603.22854v1 Announce Type: new Abstract: Deep learning techniques for rumor detection typically utilize Graph Neural Networks (GNNs) to analyze post relations. These methods, however, falter due to over-smoothing issues when processing rumor propagation structures, leading to declining performance. Our investigation into this issue reveals that over-smoothing is intrinsically tied to the structural characteristics of rumor propagation trees, in which the majority of nodes are 1-level nodes. Furthermore, GNNs struggle to capture long-range dependencies within these trees. To circumvent these challenges, we propose a Pre-Trained Propagation Tree Transformer (P2T3) method based on a pure Transformer architecture. It extracts all conversation chains from a tree structure following the propagation direction of replies, utilizes token-wise embedding to infuse connection information and introduce the necessary inductive bias, and pre-trains on large-scale unlabeled datasets. Experiments indicate that P2T3 surpasses previous state-of-the-art methods on multiple benchmark datasets and performs well under few-shot conditions. P2T3 not only avoids the over-smoothing issue inherent in GNNs but also potentially offers a large-model or unified multi-modal scheme for future social media research.
Executive Summary
The article addresses a critical limitation of current rumor detection models: over-smoothing caused by the structural characteristics of rumor propagation trees, in particular the dominance of 1-level nodes and the inability of GNNs to capture long-range dependencies. The authors propose P2T3, a Transformer-based framework that mitigates these issues by extracting conversation chains along the reply direction, embedding connection information token-wise, and leveraging pre-training on large unlabeled datasets. Experimental results demonstrate superior performance across multiple benchmarks and robustness in few-shot scenarios, effectively circumventing the over-smoothing problem inherent in GNNs. This work offers a novel architectural solution with potential broader implications for multi-modal social media analysis.
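The chain-extraction step described above can be sketched as follows. This is a hypothetical illustration only: the function name, tree representation, and toy data are invented here, and the paper's actual preprocessing may differ in detail.

```python
def extract_chains(tree, root):
    """Enumerate all root-to-leaf conversation chains in a propagation tree.

    tree: dict mapping a post ID to the list of post IDs that reply to it.
    root: ID of the source post.
    Returns a list of chains, each a list of post IDs in propagation
    (reply) order, as the abstract describes.
    """
    children = tree.get(root, [])
    if not children:                      # leaf post: the chain ends here
        return [[root]]
    chains = []
    for child in children:
        for sub in extract_chains(tree, child):
            chains.append([root] + sub)   # prepend the parent post
    return chains


# Toy tree: post 0 is the source; posts 1 and 2 reply to 0; post 3 replies to 1.
toy_tree = {0: [1, 2], 1: [3]}
print(extract_chains(toy_tree, 0))  # [[0, 1, 3], [0, 2]]
```

Each extracted chain is a linear sequence, which is what lets a pure Transformer consume the tree without graph convolutions; how the chains are tokenized and embedded is then handled by the token-wise embedding the abstract mentions.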
Key Points
- ▸ Over-smoothing is rooted in structural properties of rumor propagation trees, particularly 1-level node dominance.
- ▸ GNNs struggle with long-range dependency capture in these structures.
- ▸ P2T3 introduces a Transformer-based architecture with pre-training and directional propagation chain extraction to overcome these limitations.
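The over-smoothing point above can be made concrete with a toy experiment (not from the paper): repeated neighbor-mean aggregation, the basic operation of many GNN layers, applied to a star-shaped tree in which every reply attaches directly to the source post, i.e. the 1-level-node structure the abstract highlights. The node features collapse toward a single common value within a few rounds, leaving the network unable to distinguish posts.

```python
def mean_aggregate(features, edges, rounds):
    """Average each node's feature with its neighbors', `rounds` times.

    This mimics a simplified GNN layer (mean aggregation with a self-loop)
    without any learned weights, purely to expose the smoothing dynamics.
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:                 # undirected message passing
        neighbors[u].append(v)
        neighbors[v].append(u)
    for _ in range(rounds):
        features = [
            (features[i] + sum(features[j] for j in neighbors[i]))
            / (1 + len(neighbors[i]))
            for i in range(n)
        ]
    return features


# Star tree: node 0 is the source post, nodes 1..4 are direct (1-level) replies.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
feats = [1.0, 0.0, 0.0, 0.0, 0.0]
print(mean_aggregate(feats, star_edges, 10))
```

On this star graph the gap between the source and reply features shrinks by a constant factor every round, so after ten rounds all five features are nearly identical; deeper stacking only makes the collapse worse, which is the failure mode P2T3's Transformer formulation sidesteps.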
Merits
Architectural Innovation
P2T3’s design directly targets the root cause of over-smoothing by leveraging Transformer mechanics and directional propagation context, offering a more scalable and accurate alternative to GNNs.
Demerits
Limited Scope
While effective for propagation trees, the study does not extend validation to multimodal or heterogeneous graph scenarios, potentially limiting applicability beyond tree-structured rumor networks.
Expert Commentary
This paper makes a significant theoretical contribution by reframing the over-smoothing problem not as a computational artifact but as a structural misalignment between GNNs and rumor propagation topologies. The authors astutely identify that the prevalence of 1-level nodes in tree-like propagation networks fundamentally undermines GNN capacity to distinguish hierarchical nuance, a point often overlooked in prior literature. Their solution—pre-training on unlabeled data combined with directional propagation chain extraction—is elegant in its simplicity and effectiveness. Moreover, the choice of Transformer architecture aligns with contemporary trends in NLP, suggesting a convergence between social media analytics and mainstream AI models. Notably, the potential for extension into multi-modal domains opens new avenues for cross-modal rumor detection, such as integrating textual, visual, and metadata signals. While the absence of empirical validation on heterogeneous graphs is a minor limitation, the core insight and experimental validation are compelling. This work sets a new benchmark for rumor detection and warrants replication in cross-platform datasets to assess generalizability.
Recommendations
- ✓ 1. Platform operators should consider integrating P2T3 into their rumor detection pipelines as a primary or complementary module.
- ✓ 2. Future research should extend P2T3’s architecture to multimodal graph structures, particularly those combining text, image, and user interaction data, to assess scalability and broader applicability.
Sources
Original: arXiv - cs.CL