Quality Over Clicks: Intrinsic Quality-Driven Iterative Reinforcement Learning for Cold-Start E-Commerce Query Suggestion

arXiv:2603.22922v1. Abstract: Existing dialogue systems rely on Query Suggestion (QS) to enhance user engagement. Recent efforts typically employ large language models with a Click-Through Rate (CTR) model, yet fail in cold-start scenarios due to their heavy reliance on abundant online click data for effective CTR model training. To bridge this gap, we propose Cold-EQS, an iterative reinforcement learning framework for Cold-Start E-commerce Query Suggestion (EQS). Specifically, we leverage answerability, factuality, and information gain as rewards to continuously optimize the quality of suggested queries. To continuously optimize our QS model, we estimate uncertainty for grouped candidate suggested queries to select hard and ambiguous samples from online user queries lacking click signals. In addition, we provide an EQS-Benchmark comprising 16,949 online user queries for offline training and evaluation. Extensive offline and online experiments consistently demonstrate a

Qi Sun, Kejun Xiao, Huaipeng Zhao, Tao Luo, Xiaoyi Zeng · March 25, 2026

#cs.CL

Executive Summary

The article introduces Cold-EQS, an iterative reinforcement learning framework that addresses cold-start e-commerce query suggestion by replacing click-through rate (CTR) signals with three intrinsic quality indicators: answerability, factuality, and information gain. This removes the dependency on abundant online click data, which limits CTR-based models in cold-start settings. The framework iteratively refines the query suggestion model by estimating uncertainty over grouped candidate suggestions and selecting hard, ambiguous user queries that lack click signals for further training. Offline and online evaluations support the framework's effectiveness, with a reported +6.81% improvement in online chatUV. The accompanying EQS-Benchmark (16,949 online user queries) supports offline training, evaluation, and reproducibility.
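As a rough illustration of how three quality signals could be folded into a single reinforcement learning reward, here is a minimal sketch. The abstract names the signals but not how they are scored or combined, so the `QualityScores` container, the [0, 1] score range, and the weighted average with equal default weights are all assumptions made for illustration, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityScores:
    """Hypothetical per-suggestion scores, each assumed to lie in [0, 1]."""
    answerability: float
    factuality: float
    information_gain: float


def composite_reward(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three quality scores.

    The weighting scheme is an assumption; the source does not say how
    the signals are combined.
    """
    parts = (scores.answerability, scores.factuality, scores.information_gain)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each score must lie in [0, 1]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, parts)) / total


if __name__ == "__main__":
    suggestion = QualityScores(answerability=1.0, factuality=0.5, information_gain=0.0)
    print(composite_reward(suggestion))
```

A scalar of this kind needs no click data, which is the property the summary highlights for cold-start use.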

Key Points

▸ Shift from CTR to intrinsic quality metrics (answerability, factuality, information gain)
▸ Cold-EQS leverages uncertainty estimation to select hard/ambiguous samples without click signals
▸ Empirical validation shows a reported +6.81% improvement in online chatUV
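The sample-selection idea in the second point can be sketched as follows. This is only one plausible reading: it treats the variance of rewards within a group of candidate suggestions as the uncertainty proxy, which is an assumption, since the abstract does not say how uncertainty is estimated. The function names and example queries are hypothetical.

```python
from statistics import pvariance


def group_uncertainty(rewards):
    """Uncertainty proxy for one user query: the population variance of the
    rewards assigned to its group of candidate suggestions (assumed proxy)."""
    if len(rewards) < 2:
        return 0.0
    return pvariance(rewards)


def select_hard_queries(reward_groups, top_k):
    """Return the top_k queries whose candidate groups disagree the most.

    reward_groups maps a user query to the list of rewards of its
    candidate suggestions; no click signal is needed.
    """
    ranked = sorted(
        reward_groups,
        key=lambda query: group_uncertainty(reward_groups[query]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    groups = {
        "return policy for shoes": [0.9, 0.9, 0.9],   # candidates agree
        "is this charger compatible": [0.1, 0.9, 0.5],  # candidates disagree
        "delivery time": [0.4, 0.6, 0.5],
    }
    print(select_hard_queries(groups, top_k=2))
```

Queries whose candidates all score alike contribute little new signal, so ranking by within-group spread is a simple way to focus the next training iteration on ambiguous cases.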

Merits

Innovative Framework

Cold-EQS introduces a novel paradigm by prioritizing intrinsic content quality over user behavior metrics, offering a sustainable solution for cold-start scenarios where data scarcity is inherent.

Empirical Validation

Consistent results across offline and online experiments strengthen the credibility of the proposed methodology.

Benchmark Contribution

The provision of a curated benchmark dataset enhances transparency and facilitates broader adoption and replication.

Demerits

Scalability Concern

The reliance on uncertainty estimation for sample selection may introduce computational overhead in large-scale environments with high query volumes.

Limited Generalizability

Results are primarily validated within e-commerce query contexts; applicability to other domains (e.g., healthcare, legal search) remains unproven.

Expert Commentary

Cold-EQS represents a paradigm shift in the design of recommendation and suggestion systems. Historically, CTR-based optimization has dominated due to its measurable outcomes and alignment with revenue metrics. However, this article rightly identifies a critical flaw: CTR models are inherently reactive and dependent on post-hoc user behavior, making them inapplicable in cold-start contexts. By pivoting to intrinsic quality indicators, the authors address the root cause (data dependency) rather than its symptoms. The use of uncertainty estimation as a mechanism for sampling hard cases is particularly elegant, as it transforms a limitation (lack of click data) into a feature (opportunity for targeted refinement). Moreover, the benchmark dataset serves as a foundational resource, enabling empirical validation and comparative analysis across future studies. This work is seminal because it redefines the evaluation criteria for suggestion systems: quality is no longer a secondary consideration but the primary objective. It is a significant advancement that aligns with broader trends toward ethical, user-centric AI design. One potential avenue for future research is to integrate subjective quality assessments (e.g., user surveys) alongside objective metrics to create a hybrid evaluation framework.

Recommendations

✓ Platform developers should pilot Cold-EQS in A/B testing environments to measure impact on user engagement and retention.
✓ Academic institutions should replicate the benchmark dataset with domain-specific variations to test cross-sector applicability.
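For the A/B pilot suggested above, a minimal sketch of how relative lift and a basic significance check might be computed is given below. It assumes the engagement metric can be expressed as a per-user rate, which may not match how chatUV is actually defined; the function names and all numbers are hypothetical, and the pooled two-proportion z statistic is a standard textbook test rather than anything described in the paper.

```python
import math


def relative_lift(control_rate, treatment_rate):
    """Relative improvement of treatment over control, e.g. 0.0681 for +6.81%."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (treatment_rate - control_rate) / control_rate


def two_proportion_z(conv_control, n_control, conv_treatment, n_treatment):
    """Pooled two-proportion z statistic (treatment minus control)."""
    p_c = conv_control / n_control
    p_t = conv_treatment / n_treatment
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    return (p_t - p_c) / std_err


if __name__ == "__main__":
    # Hypothetical pilot: 10,000 users per arm.
    lift = relative_lift(control_rate=0.10, treatment_rate=0.11)
    z = two_proportion_z(1000, 10_000, 1100, 10_000)
    print(f"relative lift: {lift:.2%}, z statistic: {z:.2f}")
```

A z statistic above roughly 1.96 corresponds to significance at the two-sided 5% level under the usual normal approximation.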

Sources

Original: arXiv - cs.CL
