LLM-Augmented Knowledge Base Construction For Root Cause Analysis

arXiv:2604.06171v1 Announce Type: new Abstract: Communications networks now form the backbone of our digital world, providing fast and reliable connectivity. However, even with appropriate redundancy and failover mechanisms, it is difficult to guarantee "five 9s" (99.999%) reliability; when an outage does occur, rapid and accurate root cause analysis (RCA) becomes essential to restore service and prevent future disruptions. This study evaluates three Large Language Model (LLM) methodologies - Fine-Tuning, Retrieval-Augmented Generation (RAG), and a Hybrid approach - for constructing an RCA knowledge base from support tickets. We compare their performance using a comprehensive suite of lexical and semantic similarity metrics. Our experiments on a real industrial dataset demonstrate that the generated knowledge base provides an excellent starting point for accelerating RCA tasks and improving network resilience.

Executive Summary

This article explores the application of Large Language Models (LLMs) to automate and enhance Root Cause Analysis (RCA) in communications networks, a critical task for maintaining high service reliability. Focusing on constructing an RCA knowledge base from support tickets, the study systematically evaluates three LLM methodologies: fine-tuning, Retrieval-Augmented Generation (RAG), and a hybrid approach. Using a real-world industrial dataset, the authors compare these methods through a suite of lexical and semantic similarity metrics. The findings suggest that LLM-generated knowledge bases offer a robust foundation for significantly accelerating RCA processes, thereby contributing to improved network resilience and operational efficiency in complex digital infrastructures.

Key Points

  • The study investigates three LLM methodologies (Fine-Tuning, RAG, Hybrid) for RCA knowledge base construction.
  • The primary data source for knowledge base creation is real-world industrial support tickets related to network outages.
  • Performance evaluation employs a comprehensive suite of lexical and semantic similarity metrics.
  • The generated knowledge base is shown to provide an excellent starting point for accelerating RCA tasks.
  • The ultimate goal is to improve network resilience by expediting service restoration during outages.

Merits

Pragmatic Problem Focus

Addresses a critical, real-world operational challenge in network reliability, directly impacting 'five 9s' service availability.

Methodological Comparison

Provides a valuable comparative analysis of distinct LLM paradigms (Fine-Tuning, RAG, Hybrid) for a specific application, offering insights into their relative strengths.
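The abstract does not describe the pipelines themselves, but the RAG side of such a comparison can be sketched as a similarity lookup over ticket-derived knowledge-base entries. The sketch below is purely illustrative: `embed` is a toy bag-of-words stand-in for a real sentence-embedding model, and the `TICKET_KB` entries are hypothetical, not drawn from the paper's dataset.

```python
# Minimal sketch of a RAG-style lookup over an RCA knowledge base.
# The embedding here is a toy bag-of-words vector; a real system would
# use a trained neural encoder. All names and data are illustrative.
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a neural encoder)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical knowledge-base entries: (symptom summary, root cause).
TICKET_KB = [
    ("bgp session flapping on core router", "misconfigured hold timer"),
    ("packet loss after fiber cut on ring", "failover path not provisioned"),
    ("dns resolution timeouts in region", "resolver pool overload"),
]

def retrieve_root_cause(query: str) -> str:
    """Return the root cause of the most similar known ticket."""
    qv = embed(query)
    best = max(TICKET_KB, key=lambda kb: cosine(qv, embed(kb[0])))
    return best[1]

print(retrieve_root_cause("core router bgp session keeps flapping"))
# → misconfigured hold timer
```

In a fine-tuned variant the retrieval step disappears and the model itself is expected to internalize the ticket-to-cause mapping; a hybrid combines both, which is presumably the trade-off the paper measures.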

Industrial Dataset Validation

The use of a 'real industrial dataset' significantly enhances the credibility and generalizability of the findings, moving beyond theoretical exercises.

Comprehensive Evaluation Metrics

Employing both lexical and semantic similarity metrics offers a nuanced and robust assessment of the knowledge base quality.
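The abstract names lexical and semantic similarity metrics without specifying which. A common pairing is unigram-overlap F1 (a ROUGE-1-style lexical score) alongside embedding cosine similarity for the semantic side. The sketch below implements only the lexical half with the standard library, since the semantic half would require a specific sentence encoder the paper does not name.

```python
# Sketch of a ROUGE-1-style unigram-overlap F1, a typical lexical metric
# for comparing a generated root-cause summary against a reference one.
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between candidate and reference text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(token_f1("link failure caused by fiber cut",
                     "fiber cut caused the link failure"), 3))
# → 0.833
```

The example also shows why a lexical score alone is insufficient: paraphrases with little word overlap score near zero even when the diagnosed cause is the same, which is exactly the gap a semantic metric is meant to close.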

Demerits

Lack of Specific LLM Models

The abstract does not specify which LLM architectures (e.g., GPT-3.5, Llama, BERT variants) were used, which is crucial for reproducibility and deeper analysis of results.

Granularity of 'Support Tickets'

The abstract does not detail the structure, typical content, or pre-processing steps for the 'support tickets', which can heavily influence LLM performance.

Definition of 'Knowledge Base'

A clearer definition of the 'RCA Knowledge Base' structure (e.g., rule-based, graph-based, semantic triples) generated by the LLMs would be beneficial.
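Since the abstract leaves the knowledge-base structure open, one plausible representation is a set of structured entries linking symptoms to causes, remediations, and source tickets. The schema below is an assumption for illustration only; every field name is hypothetical and not taken from the paper.

```python
# Illustrative schema for an RCA knowledge-base entry; the abstract does
# not specify the actual structure, so every field here is an assumption.
from dataclasses import dataclass, field

@dataclass
class RCAEntry:
    symptom: str                 # observed failure signature
    root_cause: str              # diagnosed underlying cause
    remediation: str             # suggested fix
    source_tickets: list[str] = field(default_factory=list)  # provenance

entry = RCAEntry(
    symptom="intermittent packet loss on metro ring",
    root_cause="degraded optical transceiver",
    remediation="replace transceiver and verify light levels",
    source_tickets=["TKT-1042", "TKT-1177"],
)
print(entry.root_cause)
# → degraded optical transceiver
```

Whether the authors use flat records like this, semantic triples, or a graph changes both the construction prompts and the similarity metrics that make sense, which is why the missing definition matters.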

Absence of Human Baseline

While comparing LLM methods is useful, the absence of a human expert baseline for RCA knowledge base construction makes it harder to contextualize the 'excellent starting point' claim.

Expert Commentary

This study offers a compelling insight into the transformative potential of LLMs in a domain critical for modern digital economies: maintaining the robustness of communications networks. The comparative analysis of fine-tuning, RAG, and hybrid approaches against industrial data is commendable, moving beyond theoretical discussions to practical validation. While the abstract lacks granular detail on the specific LLMs used and the precise structure of the 'knowledge base,' which would be crucial for replication and deeper academic scrutiny, its emphasis on accelerating RCA tasks and improving resilience resonates strongly. The 'excellent starting point' claim, if substantiated by further research comparing against human expert performance, could herald a paradigm shift in incident management. Future work should explicitly address the interpretability of LLM outputs in RCA, the security of sensitive operational data, and the evolving regulatory landscape surrounding AI deployment in critical infrastructure. The legal and ethical implications of delegating such critical analytical tasks to AI warrant significant attention.

Recommendations

  • Detail the specific LLM models, architectures, and hyper-parameters used for each methodology to ensure reproducibility and facilitate comparative research.
  • Provide a clearer definition and examples of the 'RCA Knowledge Base' structure generated by the LLMs (e.g., semantic graphs, rule sets, annotated text).
  • Include a quantitative or qualitative comparison of LLM-generated RCA knowledge bases against a human expert-curated baseline for more robust validation.
  • Address the ethical considerations, data privacy implications, and potential for bias in LLM-augmented RCA, especially concerning sensitive industrial data.
  • Explore the integration of explainable AI (XAI) techniques to provide transparency into LLM-derived root causes, enhancing trust and auditability in critical applications.

Sources

Original: arXiv - cs.CL