RVR: Retrieve-Verify-Retrieve for Comprehensive Question Answering
arXiv:2602.18425v1 Announce Type: new Abstract: Comprehensively retrieving diverse documents is crucial to address queries that admit a wide range of valid answers. We introduce retrieve-verify-retrieve (RVR), a multi-round retrieval framework designed to maximize answer coverage. Initially, a retriever takes the...
AnCoder: Anchored Code Generation via Discrete Diffusion Models
arXiv:2602.17688v1 Announce Type: new Abstract: Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as...
Parallel Complex Diffusion for Scalable Time Series Generation
arXiv:2602.17706v1 Announce Type: new Abstract: Modeling long-range dependencies in time series generation poses a fundamental trade-off between representational capacity and computational efficiency. Traditional temporal diffusion models suffer from local entanglement and the $\mathcal{O}(L^2)$ cost of attention mechanisms. We address these...
Provable Adversarial Robustness in In-Context Learning
arXiv:2602.17743v1 Announce Type: new Abstract: Large language models adapt to new tasks through in-context learning (ICL) without parameter updates. Current theoretical explanations for this capability assume test tasks are drawn from a distribution similar to that seen during pretraining. This...
Grassmannian Mixture-of-Experts: Concentration-Controlled Routing on Subspace Manifolds
arXiv:2602.17798v1 Announce Type: new Abstract: Mixture-of-Experts models rely on learned routers to assign tokens to experts, yet standard softmax gating provides no principled mechanism to control the tradeoff between sparsity and utilization. We propose Grassmannian MoE (GrMoE), a routing framework...
MePoly: Max Entropy Polynomial Policy Optimization
arXiv:2602.17832v1 Announce Type: new Abstract: Stochastic Optimal Control provides a unified mathematical framework for solving complex decision-making problems, encompassing paradigms such as maximum entropy reinforcement learning (RL) and imitation learning (IL). However, conventional parametric policies often struggle to represent the multi-modality of...
Two Calm Ends and the Wild Middle: A Geometric Picture of Memorization in Diffusion Models
arXiv:2602.17846v1 Announce Type: new Abstract: Diffusion models generate high-quality samples but can also memorize training data, raising serious privacy concerns. Understanding the mechanisms governing when memorization versus generalization occurs remains an active area of research. In particular, it is unclear...
JAX-Privacy: A library for differentially private machine learning
arXiv:2602.17861v1 Announce Type: new Abstract: JAX-Privacy is a library designed to simplify the deployment of robust and performant mechanisms for differentially private machine learning. Guided by design principles of usability, flexibility, and efficiency, JAX-Privacy serves both researchers requiring deep customization...
Google’s Cloud AI leads on the three frontiers of model capability
AI models are pushing against three frontiers at once: raw intelligence, response time, and a third quality you might call "extensibility."
Particle’s AI news app listens to podcasts for interesting clips so you don’t have to
AI news app Particle can now pull in key moments from podcasts, letting readers instantly play short, relevant clips alongside related stories.
How AI agents could destroy the economy
Citrini Research imagines a report from two years in the future, in which unemployment has doubled and the total value of the stock market has fallen by more than a third.
When Remembering and Planning are Worth it: Navigating under Change
arXiv:2602.15274v1 Announce Type: new Abstract: We explore how different types and uses of memory can aid spatial navigation in changing uncertain environments. In the simple foraging task we study, every day, our agent has to find its way from its...
World-Model-Augmented Web Agents with Action Correction
arXiv:2602.15384v1 Announce Type: new Abstract: Web agents based on large language models have demonstrated promising capability in automating web tasks. However, current web agents struggle to reason out sensible actions due to the limitations of predicting environment changes, and might...
Common Belief Revisited
arXiv:2602.15403v1 Announce Type: new Abstract: Contrary to common belief, common belief is not KD4. If individual belief is KD45, common belief does indeed lose the 5 property and keep the D and 4 properties -- and it has none of...
RUVA: Personalized Transparent On-Device Graph Reasoning
arXiv:2602.15553v1 Announce Type: new Abstract: The Personal AI landscape is currently dominated by "Black Box" Retrieval-Augmented Generation. While standard vector databases offer statistical matching, they suffer from a fundamental lack of accountability: when an AI hallucinates or retrieves sensitive data,...
On inferring cumulative constraints
arXiv:2602.15635v1 Announce Type: new Abstract: Cumulative constraints are central in scheduling with constraint programming, yet propagation is typically performed per constraint, missing multi-resource interactions and causing severe slowdowns on some benchmarks. I present a preprocessing method for inferring additional cumulative...
CARE Drive: A Framework for Evaluating Reason-Responsiveness of Vision Language Models in Automated Driving
arXiv:2602.15645v1 Announce Type: new Abstract: Foundation models, including vision language models, are increasingly used in automated driving to interpret scenes, recommend actions, and generate natural language explanations. However, existing evaluation methods primarily assess outcome-based performance, such as safety and...
Recursive Concept Evolution for Compositional Reasoning in Large Language Models
arXiv:2602.15725v1 Announce Type: new Abstract: Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning, including ARC-AGI-2, GPQA, MATH, BBH, and HLE. Existing methods improve reasoning by expanding...
Developing AI Agents with Simulated Data: Why, what, and how?
arXiv:2602.15816v1 Announce Type: new Abstract: As insufficient data volume and quality remain the key impediments to the adoption of modern subsymbolic AI, techniques of synthetic data generation are in high demand. Simulation offers an apt, systematic approach to generating diverse...
CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation
arXiv:2602.15060v1 Announce Type: cross Abstract: Long-horizon whole-body humanoid teleoperation remains challenging due to accumulated global pose drift, particularly on full-sized humanoids. Although recent learning-based tracking methods enable agile and coordinated motions, they typically operate in the robot's local frame and...
Structural Divergence Between AI-Agent and Human Social Networks in Moltbook
arXiv:2602.15064v1 Announce Type: cross Abstract: Large populations of AI agents are increasingly embedded in online environments, yet little is known about how their collective interaction patterns compare to human social systems. Here, we analyze the full interaction network of Moltbook,...
StrokeNeXt: A Siamese-encoder Approach for Brain Stroke Classification in Computed Tomography Imagery
arXiv:2602.15087v1 Announce Type: cross Abstract: We present StrokeNeXt, a model for stroke classification in 2D Computed Tomography (CT) images. StrokeNeXt employs a dual-branch design with two ConvNeXt encoders, whose features are fused through a lightweight convolutional decoder based on stacked...
Extracting Consumer Insight from Text: A Large Language Model Approach to Emotion and Evaluation Measurement
arXiv:2602.15312v1 Announce Type: new Abstract: Accurately measuring consumer emotions and evaluations from unstructured text remains a core challenge for marketing research and practice. This study introduces the Linguistic eXtractor (LX), a fine-tuned, large language model trained on consumer-authored text that...
Far Out: Evaluating Language Models on Slang in Australian and Indian English
arXiv:2602.15373v1 Announce Type: new Abstract: Language models exhibit systematic performance gaps when processing text in non-standard language varieties, yet their ability to comprehend variety-specific slang remains underexplored for several languages. We present a comprehensive evaluation of slang awareness in Indian...
Measuring Social Integration Through Participation: Categorizing Organizations and Leisure Activities in the Displaced Karelians Interview Archive using LLMs
arXiv:2602.15436v1 Announce Type: new Abstract: Digitized historical archives make it possible to study everyday social life on a large scale, but the information extracted directly from text often does not directly allow one to answer the research questions posed by...
LuxMT Technical Report
arXiv:2602.15506v1 Announce Type: new Abstract: We introduce LuxMT, a machine translation system based on Gemma 3 27B and fine-tuned for translation from Luxembourgish (LB) into French (FR) and English (EN). To assess translation performance, we construct a novel benchmark covering...
ZeroSyl: Simple Zero-Resource Syllable Tokenization for Spoken Language Modeling
arXiv:2602.15537v1 Announce Type: new Abstract: Pure speech language models aim to learn language directly from raw audio without textual resources. A key challenge is that discrete tokens from self-supervised speech encoders result in excessively long sequences, motivating recent work on...
jina-embeddings-v5-text: Task-Targeted Embedding Distillation
arXiv:2602.15547v1 Announce Type: new Abstract: Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically trained with single- or multi-stage processes using contrastive loss functions. We introduce a novel training...
Clinically Inspired Symptom-Guided Depression Detection from Emotion-Aware Speech Representations
arXiv:2602.15578v1 Announce Type: new Abstract: Depression manifests through a diverse set of symptoms such as sleep disturbance, loss of interest, and concentration difficulties. However, most existing works treat depression prediction either as a binary label or an overall severity score...
Rethinking Metrics for Lexical Semantic Change Detection
arXiv:2602.15716v1 Announce Type: new Abstract: Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over...