Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation
arXiv:2602.22215v1 Announce Type: new Abstract: Large Language Models (LLMs) demonstrate potential in the field of scientific idea generation. However, the generated results often lack controllable academic context and traceable inspiration pathways. To bridge this gap, this paper proposes a scientific idea generation system called GYWI, which combines author knowledge graphs with retrieval-augmented generation (RAG) to form an external knowledge base that provides controllable context and traceable inspiration paths for LLMs generating new scientific ideas. We first propose an author-centered knowledge graph construction method and inspiration-source sampling algorithms to construct the external knowledge base. Then, we propose a hybrid retrieval mechanism that combines RAG and GraphRAG to retrieve knowledge with both depth and breadth, forming a hybrid context. Thirdly, we propose a prompt optimization strategy incorporating reinforcement learning principles to automatically guide LLMs in optimizing the results based on the hybrid context. To evaluate the proposed approaches, we constructed an evaluation dataset based on arXiv (2018-2023). This paper also develops a comprehensive evaluation method including empirical automatic assessment on a multiple-choice question task, LLM-based scoring, human evaluation, and semantic space visualization analysis. The generated ideas are evaluated along the following five dimensions: novelty, feasibility, clarity, relevance, and significance. We conducted experiments on different LLMs including GPT-4o, DeepSeek-V3, Qwen3-8B, and Gemini 2.5. Experimental results show that GYWI significantly outperforms mainstream LLMs on multiple metrics such as novelty, reliability, and relevance.
Executive Summary
This article proposes GYWI, a novel approach to large language model (LLM)-based scientific idea generation that integrates co-author graphs with retrieval-augmented generation. GYWI aims to provide controllable academic context and traceable inspiration pathways for LLMs. The system consists of three main components: author knowledge graph construction, a hybrid retrieval mechanism, and a prompt optimization strategy. The authors conduct extensive experiments on various LLMs, including GPT-4o, DeepSeek-V3, Qwen3-8B, and Gemini 2.5, and demonstrate that GYWI outperforms mainstream LLMs on multiple metrics. The evaluation framework, which combines automatic metrics with human evaluation, is comprehensive and rigorous, and the approach could meaningfully shape how AI-assisted research tools support scientific idea generation.
Key Points
- ▸ GYWI integrates co-author graphs with retrieval-augmented generation for LLM-based scientific idea generation.
- ▸ The system consists of three main components: author knowledge graph construction, hybrid retrieval mechanism, and prompt optimization strategy.
- ▸ GYWI outperforms mainstream LLMs in multiple metrics, including novelty, reliability, and relevance.
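The first component, author knowledge graph construction, centers on co-authorship relations. The paper's actual construction method and data schema are not reproduced here; as a minimal illustrative sketch (all field names such as `title` and `authors` are hypothetical), a co-author graph can be built from paper metadata as a weighted adjacency map:

```python
from collections import defaultdict
from itertools import combinations

def build_coauthor_graph(papers):
    """Build an undirected co-author graph from paper metadata.

    `papers` is a list of dicts with 'title' and 'authors' keys
    (a hypothetical schema, not the paper's actual format).
    Returns an adjacency map: author -> {coauthor: shared-paper count}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for paper in papers:
        # Each unordered author pair on a paper gains one shared credit.
        for a, b in combinations(sorted(set(paper["authors"])), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

papers = [
    {"title": "Graph RAG for Ideas", "authors": ["Li", "Chen", "Wang"]},
    {"title": "LLM Idea Generation", "authors": ["Li", "Chen"]},
]
g = build_coauthor_graph(papers)
```

Edge weights (shared-paper counts) give a natural signal for the paper's inspiration-source sampling: frequent collaborators form dense neighborhoods from which candidate inspirations can be drawn.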
Merits
Strength of Hybrid Retrieval Mechanism
The hybrid retrieval mechanism, which combines RAG and GraphRAG, retrieves knowledge with both depth and breadth, providing a comprehensive context for LLMs.
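The depth/breadth split can be sketched as follows. This is an illustration only, not the paper's implementation: token overlap stands in for embedding similarity on the RAG side, and one-hop co-author expansion stands in for GraphRAG; all function and variable names are hypothetical.

```python
from collections import deque

def rag_retrieve(query, corpus, k=2):
    """Depth: rank documents by token overlap with the query
    (a simple stand-in for embedding similarity)."""
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def graph_expand(seeds, graph, hops=1):
    """Breadth: collect co-authors up to `hops` away from the seed authors."""
    seen, frontier = set(seeds), deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for nb in graph.get(node, []):
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return seen

def hybrid_context(query, corpus, doc_authors, coauthor_graph):
    """Merge depth (top-ranked documents) with breadth (the co-author
    neighborhood of those documents' authors) into one context."""
    docs = rag_retrieve(query, corpus)
    seeds = {a for d in docs for a in doc_authors[d]}
    return {"documents": docs, "authors": graph_expand(seeds, coauthor_graph)}

corpus = [
    "graph retrieval for science",
    "llm idea generation",
    "cooking pasta recipes",
]
doc_authors = {corpus[0]: ["Li"], corpus[1]: ["Chen"], corpus[2]: ["Zhao"]}
coauthors = {"Li": ["Chen"], "Chen": ["Li", "Wang"], "Wang": ["Chen"]}
ctx = hybrid_context("graph retrieval for llm idea generation",
                     corpus, doc_authors, coauthors)
```

The returned context pairs the most relevant documents (depth) with authors the graph connects to them (breadth), mirroring how the two retrieval channels complement each other.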
Comprehensive Evaluation Framework
The authors' evaluation framework, which includes multiple metrics and human evaluation, provides a rigorous assessment of the proposed approach.
Demerits
Limited Generalizability
The experiments were conducted on a specific dataset (arXiv, 2018-2023) and may not be generalizable to other domains or datasets.
Dependence on Pre-trained LLMs
The proposed approach relies on pre-trained LLMs, which may limit its scalability and adaptability to new tasks or domains.
Expert Commentary
The proposed approach, GYWI, is a significant contribution to the field of LLM-based scientific idea generation. The integration of co-author graphs with retrieval-augmented generation provides a novel and comprehensive context for LLMs, enabling them to generate more novel and relevant ideas. The authors' evaluation framework is rigorous and comprehensive, and the experimental results demonstrate the effectiveness of GYWI. However, the limited generalizability of the approach and its dependence on pre-trained LLMs are notable limitations. Furthermore, the potential implications of GYWI for the development of AI-powered research tools and platforms require careful consideration and discussion.
Recommendations
- ✓ Future research should focus on extending the proposed approach to other domains and datasets, and exploring its application in real-world research settings.
- ✓ The development of more robust and adaptable LLMs is essential for the widespread adoption of GYWI and similar approaches.