From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production

arXiv:2602.20558v1 Announce Type: new Abstract: Large language models (LLMs) are promising backbones for generative recommender systems, yet a key challenge remains underexplored: verbalization, i.e., converting structured user interaction logs into effective natural language inputs. Existing methods rely on rigid templates that simply concatenate fields, yielding suboptimal representations for recommendation. We propose a data-centric framework that learns verbalization for LLM-based recommendation. Using reinforcement learning, a verbalization agent transforms raw interaction histories into optimized textual contexts, with recommendation accuracy as the training signal. This agent learns to filter noise, incorporate relevant metadata, and reorganize information to improve downstream predictions. Experiments on a large-scale industrial streaming dataset show that learned verbalization delivers up to 93% relative improvement in discovery item recommendation accuracy over template-based baselines. Further analysis reveals emergent strategies such as user interest summarization, noise removal, and syntax normalization, offering insights into effective context construction for LLM-based recommender systems.

Executive Summary

This study addresses verbalization, an underexplored challenge in large language model (LLM)-based recommendation: converting structured user interaction logs into effective natural language inputs. The authors propose a data-centric framework in which a reinforcement-learning agent transforms raw interaction histories into optimized textual contexts, using downstream recommendation accuracy as the training signal. Experiments on a large-scale industrial streaming dataset demonstrate up to a 93% relative improvement in discovery item recommendation accuracy over template-based baselines. The emergent strategies, including user interest summarization, noise removal, and syntax normalization, offer insight into effective context construction. The work underscores the importance of verbalization for LLM-based recommender systems and has practical implications for domains such as e-commerce and media streaming.
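The learned-verbalization loop described above can be caricatured as a small bandit problem. The following Python sketch is an illustrative assumption, not the authors' implementation: the action set, the keyword-matching stand-in for the LLM recommender, and all names are invented. It shows how recommendation accuracy alone can drive the choice of verbalization strategy:

```python
import random

# Toy sketch: a "verbalization agent" decides how to turn a structured
# interaction log into text for a recommender, and is rewarded when the
# downstream recommendation matches the held-out target item.

ACTIONS = ["raw_concat", "drop_noise", "summarize_interests"]

def verbalize(log, action):
    """Render a structured interaction log as a textual context."""
    if action == "raw_concat":  # template-style baseline: dump every event
        return "User clicked: " + ", ".join(e["item"] for e in log)
    if action == "drop_noise":  # filter accidental short-dwell events
        kept = [e["item"] for e in log if e["dwell_sec"] >= 5]
        return "User clicked: " + ", ".join(kept)
    # summarize_interests: compress the history into a dominant-genre summary
    genres = [e["genre"] for e in log if e["dwell_sec"] >= 5]
    if not genres:
        return "New user with no clear interests yet."
    top = max(set(genres), key=genres.count)
    return f"User is mostly interested in {top}."

def toy_recommender(context, candidates):
    """Stand-in for the LLM: picks the first candidate whose genre or id
    appears in the context, falling back to a random guess."""
    for c in candidates:
        if c["genre"] in context or c["item"] in context:
            return c["item"]
    return random.choice(candidates)["item"]

def train_verbalizer(logs, candidate_sets, targets, epochs=500, eps=0.2):
    """Epsilon-greedy bandit over verbalization actions: reward is 1 when
    the recommender's pick matches the held-out target item."""
    value = {a: 0.0 for a in ACTIONS}
    count = {a: 0 for a in ACTIONS}
    for _ in range(epochs):
        for log, cands, target in zip(logs, candidate_sets, targets):
            action = (random.choice(ACTIONS) if random.random() < eps
                      else max(ACTIONS, key=value.get))
            hit = toy_recommender(verbalize(log, action), cands) == target
            count[action] += 1
            value[action] += ((1.0 if hit else 0.0) - value[action]) / count[action]
    return max(ACTIONS, key=value.get)
```

On a toy log where two long-dwell sci-fi clicks are mixed with one accidental click, only the summarizing action lets the recommender identify an unseen sci-fi discovery item by genre, so the bandit converges to it. The paper's agent analogously discovers noise removal and interest summarization, but as an LLM policy trained with reinforcement learning against a production-scale recommender.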

Key Points

  • Proposed a data-centric framework that learns verbalization using reinforcement learning
  • Trained a verbalization agent to transform raw interaction histories into optimized textual contexts, with recommendation accuracy as the training signal
  • Achieved up to a 93% relative improvement in discovery item recommendation accuracy over template-based baselines
  • Emphasized the importance of verbalization and context construction for accurate recommendations

Merits

Strengths of the Framework

The proposed data-centric framework is flexible, scalable, and adaptable to various recommendation tasks, allowing for the discovery of optimal verbalization strategies through reinforcement learning.

Empirical Evidence

The experimental results on an industrial streaming dataset demonstrate the effectiveness of the framework in improving recommendation accuracy, providing strong evidence for its practical application.

Demerits

Limitation on Generalizability

The study relied on a specific industrial streaming dataset, which may limit the generalizability of the findings to other domains or recommendation tasks.

Complexity of the Framework

The proposed framework involves reinforcement learning, which can be computationally expensive and require significant expertise in machine learning.

Expert Commentary

The proposed framework and experimental results offer valuable insight into the role of verbalization and context construction in LLM-based recommender systems. Future research should address the framework's two main limitations: generalizability beyond a single industrial dataset, and the computational cost and machine-learning expertise that reinforcement learning demands. Even so, the study makes a significant contribution to LLM-based recommendation and has clear implications for building and improving recommender systems across industries.

Recommendations

  • Further research should be conducted to explore the generalizability of the framework to other domains and recommendation tasks.
  • The proposed framework should be applied to other types of recommendation tasks, such as rating prediction or ranking.
