
TRIZ-RAGNER: A Retrieval-Augmented Large Language Model for TRIZ-Aware Named Entity Recognition in Patent-Based Contradiction Mining


Zitong Xu, Yuqing Wu, Yue Zhao

arXiv:2602.23656v1

Abstract: TRIZ-based contradiction mining is a fundamental task in patent analysis and systematic innovation, as it enables the identification of improving and worsening technical parameters that drive inventive problem solving. However, existing approaches largely rely on rule-based systems or traditional machine learning models, which struggle with semantic ambiguity, domain dependency, and limited generalization when processing complex patent language. Recently, large language models (LLMs) have shown strong semantic understanding capabilities, yet their direct application to TRIZ parameter extraction remains challenging due to hallucination and insufficient grounding in structured TRIZ knowledge. To address these limitations, this paper proposes TRIZ-RAGNER, a retrieval-augmented large language model framework for TRIZ-aware named entity recognition in patent-based contradiction mining. TRIZ-RAGNER reformulates contradiction mining as a semantic-level NER task and integrates dense retrieval over a TRIZ knowledge base, cross-encoder reranking for context refinement, and structured LLM prompting to extract improving and worsening parameters from patent sentences. By injecting domain-specific TRIZ knowledge into the LLM reasoning process, the proposed framework effectively reduces semantic noise and improves extraction consistency. Experiments on the PaTRIZ dataset demonstrate that TRIZ-RAGNER consistently outperforms traditional sequence labeling models and LLM-based baselines. The proposed framework achieves a precision of 85.6%, a recall of 82.9%, and an F1-score of 84.2% in TRIZ contradiction pair identification. Compared with the strongest baseline using prompt-enhanced GPT, TRIZ-RAGNER yields an absolute F1-score improvement of 7.3 percentage points, confirming the effectiveness of retrieval-augmented TRIZ knowledge grounding for robust and accurate patent-based contradiction mining.

Executive Summary

This paper presents TRIZ-RAGNER, a retrieval-augmented large language model framework for TRIZ-aware named entity recognition in patent-based contradiction mining. TRIZ-RAGNER reformulates contradiction mining as a semantic-level NER task, integrating dense retrieval over a TRIZ knowledge base, cross-encoder reranking for context refinement, and structured LLM prompting to extract improving and worsening parameters from patent sentences. The proposed framework effectively reduces semantic noise and improves extraction consistency, outperforming traditional sequence labeling models and LLM-based baselines on the PaTRIZ dataset. The results demonstrate the effectiveness of retrieval-augmented TRIZ knowledge grounding for robust and accurate patent-based contradiction mining, with a precision of 85.6%, a recall of 82.9%, and an F1-score of 84.2%.
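The structured-prompting step of such a pipeline can be sketched as follows. This is an illustrative reconstruction, not the paper's actual prompt: the template wording, JSON schema, and function names are assumptions, though the TRIZ parameter names and numbers ("Strength" is parameter 14, "Weight of moving object" is parameter 1) come from the standard 39-parameter list.

```python
import json

# Hypothetical prompt template: the LLM is asked to return the improving and
# worsening TRIZ parameters as JSON, constrained by retrieved context.
PROMPT_TEMPLATE = (
    "You are a TRIZ analyst. Given the patent sentence and the retrieved TRIZ "
    "context below, identify the improving and worsening technical parameters.\n"
    "Context:\n{context}\n"
    "Sentence: {sentence}\n"
    'Answer as JSON: {{"improving": "...", "worsening": "..."}}'
)

def build_prompt(sentence: str, context_passages: list[str]) -> str:
    return PROMPT_TEMPLATE.format(
        context="\n".join(context_passages), sentence=sentence
    )

def parse_contradiction(llm_output: str) -> tuple[str, str]:
    """Parse the model's JSON answer into an (improving, worsening) pair."""
    data = json.loads(llm_output)
    return data["improving"], data["worsening"]

prompt = build_prompt(
    "Increasing the blade thickness improves durability but adds weight.",
    ["TRIZ parameter 14: Strength", "TRIZ parameter 1: Weight of moving object"],
)
pair = parse_contradiction(
    '{"improving": "Strength", "worsening": "Weight of moving object"}'
)
```

Constraining the answer to a fixed JSON schema is a common way to make LLM extraction output machine-checkable, which complements the retrieval grounding the paper uses against hallucination.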

Key Points

  • TRIZ-RAGNER is a retrieval-augmented large language model framework for TRIZ-aware named entity recognition in patent-based contradiction mining.
  • The framework integrates dense retrieval over a TRIZ knowledge base, cross-encoder reranking for context refinement, and structured LLM prompting.
  • TRIZ-RAGNER outperforms traditional sequence labeling models and LLM-based baselines on the PaTRIZ dataset.
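The retrieve-then-rerank stage listed above can be sketched with toy components. A real system would use learned embedding and cross-encoder models; here a bag-of-words cosine scorer stands in for dense retrieval and a token-overlap scorer stands in for the cross-encoder, over a handful of standard TRIZ parameter names.

```python
from collections import Counter
from math import sqrt

# Miniature TRIZ "knowledge base": standard parameter names.
TRIZ_KB = [
    "Weight of moving object",
    "Strength",
    "Durability of moving object",
    "Speed",
    "Temperature",
]

def vec(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[t] * b[t] for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query: str, k: int = 3) -> list[str]:
    # First stage: cheap scoring of every KB entry (stands in for dense retrieval).
    q = vec(query)
    return sorted(TRIZ_KB, key=lambda d: cosine(q, vec(d)), reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Second stage: finer-grained scoring of the short list
    # (stands in for cross-encoder reranking).
    q = set(query.lower().split())
    return sorted(candidates, key=lambda d: len(q & set(d.lower().split())), reverse=True)

query = "durability of the moving blade"
hits = rerank(query, retrieve(query))
```

The two-stage design is the standard trade-off: the first stage must be cheap enough to scan the whole knowledge base, while the second can afford a more expensive query-document comparison on the few survivors.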

Merits

Strength in Addressing Semantic Ambiguity

The proposed framework effectively reduces semantic noise and improves extraction consistency by integrating dense retrieval over a TRIZ knowledge base.

Improved Extraction Consistency

TRIZ-RAGNER achieves high precision, recall, and F1-score in TRIZ contradiction pair identification, demonstrating its effectiveness in robust and accurate patent-based contradiction mining.
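The three reported scores are internally consistent: F1 is the harmonic mean of precision and recall, and plugging in the reported values reproduces the reported F1.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported: P = 85.6, R = 82.9 -> F1 rounds to the reported 84.2.
f1 = round(f1_score(85.6, 82.9), 1)
```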

Effective Knowledge Grounding

The retrieval-augmented TRIZ knowledge grounding in TRIZ-RAGNER enables the model to accurately extract improving and worsening parameters from patent sentences.
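Once an (improving, worsening) pair has been extracted, downstream TRIZ workflows typically look it up in the classical 39x39 contradiction matrix to obtain candidate inventive principles. The sketch below shows that lookup; the principle numbers in the two cells are illustrative placeholders, not verified matrix values.

```python
# Illustrative fragment of a contradiction matrix:
# (improving parameter, worsening parameter) -> inventive principle numbers.
CONTRADICTION_MATRIX = {
    ("Strength", "Weight of moving object"): [1, 8, 40, 15],
    ("Speed", "Temperature"): [2, 28, 36, 30],
}

def suggest_principles(improving: str, worsening: str) -> list[int]:
    """Return candidate inventive principles, or [] for an unlisted pair."""
    return CONTRADICTION_MATRIX.get((improving, worsening), [])

principles = suggest_principles("Strength", "Weight of moving object")
```

This is the step that makes accurate parameter extraction valuable: a wrongly extracted parameter indexes the wrong matrix cell and surfaces irrelevant inventive principles.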

Demerits

Limited Generalizability

The framework may struggle with generalization to other domains or datasets, as it is specifically designed for patent-based contradiction mining.

High Computational Requirements

The dense retrieval and cross-encoder reranking components may require significant computational resources, potentially limiting the framework's scalability.

Expert Commentary

The proposed framework, TRIZ-RAGNER, demonstrates a clear advance in TRIZ-aware named entity recognition and patent-based contradiction mining, reporting an absolute F1-score improvement of 7.3 percentage points over the strongest prompt-enhanced GPT baseline. By integrating dense retrieval over a TRIZ knowledge base, cross-encoder reranking for context refinement, and structured LLM prompting, the framework reduces semantic noise and improves extraction consistency. However, its computational cost and domain-specific design may pose challenges for widespread adoption. Nevertheless, the retrieval-augmented TRIZ knowledge grounding in TRIZ-RAGNER underscores the broader value of grounding large language models in structured domain knowledge for robust and accurate performance.

Recommendations

  • Further research should focus on developing more efficient and scalable versions of the framework, addressing the high computational requirements and limited generalizability.
  • The retrieval-augmented knowledge grounding approach should be extended to other knowledge-intensive extraction tasks beyond patent analysis, to test whether the gains in performance and robustness carry over.
