Tokenization, Fusion and Decoupling: Bridging the Granularity Mismatch Between Large Language Models and Knowledge Graphs
arXiv:2602.22698v1 Announce Type: new Abstract: Leveraging Large Language Models (LLMs) for Knowledge Graph Completion (KGC) is promising but hindered by a fundamental granularity mismatch. LLMs operate on fragmented token sequences, whereas entities are the fundamental units in knowledge graph (KG) scenarios. Existing approaches typically constrain predictions to limited candidate sets or align entities with the LLM's vocabulary by pooling multiple tokens or decomposing entities into fixed-length token sequences, which fail to capture both the semantic meaning of the text and the structural integrity of the graph. To address this, we propose KGT, a novel framework that uses dedicated entity tokens to enable efficient, full-space prediction. Specifically, we first introduce specialized tokenization to construct feature representations at the level of dedicated entity tokens. We then fuse pre-trained structural and textual features into these unified embeddings via a relation-guided gating mechanism, avoiding training from scratch. Finally, we implement decoupled prediction by leveraging independent heads to separate and combine semantic and structural reasoning. Experimental results show that KGT consistently outperforms state-of-the-art methods across multiple benchmarks.
Executive Summary
This article proposes KGT, a novel framework that addresses the granularity mismatch between Large Language Models (LLMs) and Knowledge Graphs (KGs). KGT uses dedicated entity tokens to enable efficient, full-space prediction, and fuses pre-trained structural and textual features via a relation-guided gating mechanism. Experimental results demonstrate KGT's superiority over state-of-the-art methods across multiple benchmarks. The framework's ability to capture both the semantic meaning of the text and the structural integrity of the graph is a significant advancement in KGC. However, its reliance on pre-trained models and potential scalability limitations require further exploration. These findings have practical implications for building more effective and efficient KGC systems.
Key Points
- ▸ KGT addresses the granularity mismatch between LLMs and KGs by using dedicated entity tokens.
- ▸ The framework fuses pre-trained structural and textual features via a relation-guided gating mechanism.
- ▸ Independent heads are used for decoupled prediction to separate and combine semantic and structural reasoning.
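The idea behind dedicated entity tokens is to extend the LLM's vocabulary so that each KG entity is addressed by a single token id rather than a fragmented subword sequence. The paper does not publish its exact configuration, so the sizes and lookup scheme below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper does not report its exact configuration.
VOCAB_SIZE, NUM_ENTITIES, HIDDEN = 32000, 14541, 64

# Base LLM word embeddings plus one dedicated embedding per KG entity,
# appended after the ordinary vocabulary.
word_emb = rng.normal(size=(VOCAB_SIZE, HIDDEN))
entity_emb = rng.normal(size=(NUM_ENTITIES, HIDDEN))
table = np.vstack([word_emb, entity_emb])

def embed(ids):
    """Look up a mixed sequence of word ids and entity ids (>= VOCAB_SIZE)."""
    return table[np.asarray(ids)]

# An entity appears as one token among ordinary word tokens:
seq = [5, 17, VOCAB_SIZE + 3]  # two word tokens, one dedicated entity token
print(embed(seq).shape)        # (3, 64)
```

Because every entity has exactly one id, scoring over the appended rows of the table amounts to prediction over the full entity space, with no candidate-set restriction.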
Merits
Strength in Addressing Granularity Mismatch
KGT's use of dedicated entity tokens and relation-guided gating mechanism effectively addresses the granularity mismatch between LLMs and KGs, enabling more accurate and efficient knowledge graph completion.
Flexibility in Feature Fusion
The framework's ability to fuse pre-trained structural and textual features via a relation-guided gating mechanism allows for flexible and effective combination of different knowledge sources.
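A relation-guided gate of this kind can be sketched as a sigmoid gate computed from the relation embedding that interpolates, per dimension, between the structural and textual features. The weight matrix and feature vectors below are random stand-ins for learned parameters, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN = 64

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical learned gate parameters (random stand-ins).
W_gate = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))

def relation_guided_fuse(structural, textual, relation):
    """Per-dimension gate in (0, 1), driven by the relation embedding."""
    gate = sigmoid(W_gate @ relation)
    return gate * structural + (1.0 - gate) * textual

s = rng.normal(size=HIDDEN)  # pre-trained structural feature (e.g. from a KGE model)
t = rng.normal(size=HIDDEN)  # textual feature (e.g. from the LLM)
r = rng.normal(size=HIDDEN)  # relation embedding
fused = relation_guided_fuse(s, t, r)
print(fused.shape)  # (64,)
```

Since the gate lies strictly in (0, 1), each fused dimension is a convex combination of the two source features, which is what lets pre-trained features be reused without training from scratch.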
Improved Prediction Accuracy
KGT's decoupled prediction approach using independent heads enables more accurate separation and combination of semantic and structural reasoning, leading to improved prediction accuracy.
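Decoupled prediction can be sketched as two independent linear heads that each score the full entity set, with their score vectors combined afterwards. The head weights, entity count, and mixing weight `alpha` here are illustrative assumptions, not the paper's reported design:

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, NUM_ENTITIES = 64, 1000

# Two independent scoring heads over the full entity set
# (random stand-ins for learned parameters).
W_sem = rng.normal(scale=0.1, size=(NUM_ENTITIES, HIDDEN))  # semantic head
W_str = rng.normal(scale=0.1, size=(NUM_ENTITIES, HIDDEN))  # structural head

def predict(hidden, alpha=0.5):
    """Score every entity with each head separately, then combine the scores."""
    sem_scores = W_sem @ hidden
    str_scores = W_str @ hidden
    return alpha * sem_scores + (1.0 - alpha) * str_scores

h = rng.normal(size=HIDDEN)  # final hidden state for the query
scores = predict(h)
print(scores.shape)          # one score per entity: full-space ranking
```

Keeping the heads separate lets semantic and structural evidence be weighted independently before the final ranking, rather than forcing a single entangled representation.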
Demerits
Reliance on Pre-trained Models
KGT's reliance on pre-trained models may limit its scalability and generalizability to new domains and tasks.
Potential Scalability Limitations
The framework's use of dedicated entity tokens and relation-guided gating mechanism may lead to increased computational complexity and memory requirements, particularly for large-scale knowledge graphs.
Expert Commentary
The article's contribution to the field of KGC is substantial, and the proposed framework, KGT, demonstrates a significant advancement in addressing the granularity mismatch between LLMs and KGs. However, the reliance on pre-trained models and the potential scalability limitations remain open questions. Future research should focus on addressing these limitations and exploring applications of KGT in real-world scenarios.
Recommendations
- ✓ Further investigation into the scalability and generalizability of KGT is necessary to ensure its widespread adoption in real-world applications.
- ✓ The development of more effective and efficient KGC systems has significant practical implications for applications such as question answering, recommender systems, and natural language processing.