AI Agents for Inventory Control: Human-LLM-OR Complementarity
arXiv:2602.12631v1 Announce Type: new Abstract: Inventory control is a fundamental operations problem in which ordering decisions are traditionally guided by theoretically grounded operations research (OR) algorithms. However, such algorithms often rely on rigid modeling assumptions and can perform poorly when...
Consistency of Large Reasoning Models Under Multi-Turn Attacks
arXiv:2602.13093v2 Announce Type: new Abstract: Large reasoning models achieve state-of-the-art performance on complex tasks, but their robustness under multi-turn adversarial pressure remains underexplored. We evaluate nine frontier reasoning models under adversarial attacks. Our findings reveal that reasoning...
Optimal Take-off under Fuzzy Clearances
arXiv:2602.13166v1 Announce Type: new Abstract: This paper presents a hybrid obstacle avoidance architecture that integrates Optimal Control under clearance with a Fuzzy Rule Based System (FRBS) to enable adaptive constraint handling for unmanned aircraft. Motivated by the limitations of classical...
Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction
arXiv:2602.12287v1 Announce Type: cross Abstract: End-to-end automatic speech recognition (ASR) systems frequently misrecognize domain-specific phrases like named entities, which can cause catastrophic failures in downstream tasks. A new family of named entity correction methods based on large language models (LLMs)...
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
arXiv:2602.12322v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models convert high-level language instructions into concrete, executable actions, a task that is especially challenging in open-world environments. We present Visual Foresight Planning (ForeAct), a general and efficient planner that guides a VLA...
Rational Neural Networks have Expressivity Advantages
arXiv:2602.12390v1 Announce Type: cross Abstract: We study neural networks with trainable low-degree rational activation functions and show that they are more expressive and parameter-efficient than modern piecewise-linear and smooth activations such as ELU, LeakyReLU, LogSigmoid, PReLU, ReLU, SELU, CELU, Sigmoid,...
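As background for the abstract above, a rational activation is the ratio of two low-degree polynomials with trainable coefficients. The sketch below is an illustrative minimal version, not the paper's construction: the function name, the coefficient ordering (highest degree first, as in `np.polyval`), and the denominator safeguard `1 + |Q(x)|` are all assumptions chosen to keep the example self-contained and division-safe.

```python
import numpy as np

def rational_activation(x, p_coeffs, q_coeffs):
    """Low-degree rational activation r(x) = P(x) / (1 + |Q(x)|).

    P and Q are polynomials whose coefficients would be trained
    alongside the network weights; the 1 + |Q(x)| form is one common
    way to keep the denominator strictly positive.
    """
    num = np.polyval(p_coeffs, x)
    den = 1.0 + np.abs(np.polyval(q_coeffs, x))
    return num / den

# With P(x) = x and Q(x) = 0 the activation reduces to the identity.
x = np.array([2.0, -1.0, 0.5])
print(rational_activation(x, [1.0, 0.0], [0.0]))
```

In practice such activations are fitted to initialize near a known nonlinearity (e.g. ReLU) and then trained end to end; the claimed expressivity advantage comes from a single rational unit approximating many piecewise-linear segments.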
Reproducing DragDiffusion: Interactive Point-Based Editing with Diffusion Models
arXiv:2602.12393v1 Announce Type: cross Abstract: DragDiffusion is a diffusion-based method for interactive point-based image editing that enables users to manipulate images by directly dragging selected points. The method claims that accurate spatial control can be achieved by optimizing a single...
What does RL improve for Visual Reasoning? A Frankenstein-Style Analysis
arXiv:2602.12395v1 Announce Type: cross Abstract: Reinforcement learning (RL) with verifiable rewards has become a standard post-training stage for boosting visual reasoning in vision-language models, yet it remains unclear what capabilities RL actually improves compared with supervised fine-tuning as cold-start initialization...
Soft Contamination Means Benchmarks Test Shallow Generalization
arXiv:2602.12413v1 Announce Type: cross Abstract: If LLM training data is polluted with benchmark test data, then benchmark performance gives biased estimates of out-of-distribution (OOD) generalization. Typical decontamination filters rely on n-gram matching, which fails to detect semantic duplicates: sentences with equivalent...
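The failure mode the abstract describes is easy to demonstrate. The sketch below is a hypothetical minimal n-gram decontamination filter (the function names and the 3-gram choice are illustrative, not from the paper): a verbatim copy of a test item is flagged, while a word-for-word paraphrase with the same meaning shares no trigrams and slips through.

```python
def ngrams(text, n=3):
    """Set of word n-grams of a lowercased, whitespace-tokenized string."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def ngram_overlap(a, b, n=3):
    """Fraction of shared n-grams relative to the smaller n-gram set."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / min(len(ga), len(gb))

test_item  = "the cat sat on the mat near the door"
verbatim   = "the cat sat on the mat near the door"
paraphrase = "a feline rested upon the rug by the entrance"

print(ngram_overlap(test_item, verbatim))    # exact copy: caught
print(ngram_overlap(test_item, paraphrase))  # semantic duplicate: missed
```

The paraphrase is a semantic duplicate in the abstract's sense, yet its n-gram overlap is zero, which is exactly why the paper calls this "soft contamination" that n-gram filters cannot remove.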
CacheMind: From Miss Rates to Why -- Natural-Language, Trace-Grounded Reasoning for Cache Replacement
arXiv:2602.12422v1 Announce Type: cross Abstract: Cache replacement remains a challenging problem in CPU microarchitecture, often addressed using hand-crafted heuristics, limiting cache performance. Cache data analysis requires parsing millions of trace entries with manual filtering, making the process slow and non-interactive....
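For context on the hand-crafted heuristics the abstract refers to, the classic least-recently-used (LRU) policy can be sketched in a few lines. This is a generic textbook illustration, not anything from CacheMind itself; the class and method names are invented for the example.

```python
from collections import OrderedDict

class LRUCache:
    """LRU replacement: on a miss at full capacity, evict the line
    that has gone unreferenced the longest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # insertion order tracks recency

    def access(self, addr):
        """Return True on a hit, False on a miss (filling the line)."""
        if addr in self.lines:
            self.lines.move_to_end(addr)    # hit: mark most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # miss: evict the LRU line
        self.lines[addr] = True
        return False

cache = LRUCache(2)
trace = ["A", "B", "A", "C", "B"]
print([cache.access(a) for a in trace])  # hits/misses along the trace
```

Heuristics like this are fixed in silicon and cannot explain *why* a given workload misses, which is the gap the trace-grounded natural-language reasoning in the abstract targets.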
Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward
arXiv:2602.12430v2 Announce Type: cross Abstract: The transition from monolithic language models to modular, skill-equipped agents marks a defining shift in how large language models (LLMs) are deployed in practice. Rather than encoding all procedural knowledge within model weights, agent skills...
Multimodal Large Language Models (MLLMs): From Theory to Practice
arXiv:2602.12302v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) combine the natural language understanding and generation capabilities of LLMs with perception skills in modalities such as image and audio, representing a key advancement in contemporary AI. This chapter presents...
CLASE: A Hybrid Method for Chinese Legalese Stylistic Evaluation
arXiv:2602.12639v1 Announce Type: new Abstract: Legal text generated by large language models (LLMs) can usually achieve reasonable factual accuracy, but it frequently fails to adhere to the specialised stylistic norms and linguistic conventions of legal writing. In order to improve...
Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR
arXiv:2602.12642v1 Announce Type: new Abstract: Reward-maximizing RL methods enhance the reasoning performance of LLMs, but often reduce the diversity among outputs. Recent works address this issue by adopting GFlowNets, training LLMs to match a target distribution while jointly learning its...
Learning Ordinal Probabilistic Reward from Preferences
arXiv:2602.12660v1 Announce Type: new Abstract: Reward models are crucial for aligning large language models (LLMs) with human values and intentions. Existing approaches follow either Generative (GRMs) or Discriminative (DRMs) paradigms, yet both suffer from limitations: GRMs typically demand costly point-wise...
AIWizards at MULTIPRIDE: A Hierarchical Approach to Slur Reclamation Detection
arXiv:2602.12818v1 Announce Type: new Abstract: Detecting reclaimed slurs represents a fundamental challenge for hate speech detection systems, as the same lexical items can function either as abusive expressions or as in-group affirmations depending on social identity and context. In this...
Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models
arXiv:2602.12996v1 Announce Type: new Abstract: Knowledge augmentation has significantly enhanced the performance of Large Language Models (LLMs) in knowledge-intensive tasks. However, existing methods typically operate on the simplistic premise that model performance equates with internal knowledge, overlooking the knowledge-confidence gaps...
Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech
arXiv:2602.13047v1 Announce Type: new Abstract: Conversational speech often reveals early signs of cognitive decline, such as dementia and MCI. In the UK, one in four people belongs to an ethnic minority, and dementia prevalence is expected to rise most rapidly...
Exploring a New Competency Modeling Process with Large Language Models
arXiv:2602.13084v1 Announce Type: new Abstract: Competency modeling is widely used in human resource management to select, develop, and evaluate talent. However, traditional expert-driven approaches rely heavily on manual analysis of large volumes of interview transcripts, making them costly and prone...
Towards interpretable models for language proficiency assessment: Predicting the CEFR level of Estonian learner texts
arXiv:2602.13102v1 Announce Type: new Abstract: Using NLP to analyze authentic learner language helps to build automated assessment and feedback tools. It also offers new and extensive insights into the development of second language production. However, there is a lack of...
OpenLID-v3: Improving the Precision of Closely Related Language Identification -- An Experience Report
arXiv:2602.13139v1 Announce Type: new Abstract: Language identification (LID) is an essential step in building high-quality multilingual datasets from web data. Existing LID tools (such as OpenLID or GlotLID) often struggle to identify closely related languages and to distinguish valid natural...
Constraint-Rectified Training for Efficient Chain-of-Thought
arXiv:2602.12526v1 Announce Type: cross Abstract: Chain-of-Thought (CoT) has significantly enhanced the reasoning capabilities of Large Language Models (LLMs), especially when combined with reinforcement learning (RL) based post-training methods. While longer reasoning traces can improve answer quality and unlock abilities such...
DiffuRank: Effective Document Reranking with Diffusion Language Models
arXiv:2602.12528v1 Announce Type: cross Abstract: Recent advances in large language models (LLMs) have inspired a new paradigm for document reranking. While this paradigm better exploits the reasoning and contextual understanding capabilities of LLMs, most existing LLM-based rerankers rely on autoregressive generation,...
HyperMLP: An Integrated Perspective for Sequence Modeling
arXiv:2602.12601v1 Announce Type: cross Abstract: Self-attention is often viewed as probabilistic query-key lookup, motivating designs that preserve normalized attention scores and fixed positional semantics. We advocate a simpler and more unified perspective: an autoregressive attention head can be viewed as...
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
arXiv:2602.12735v1 Announce Type: cross Abstract: Effectively retrieving, reasoning over, and understanding multimodal information remains a critical challenge for agentic systems. Traditional Retrieval-augmented Generation (RAG) methods rely on linear interaction histories, which struggle to handle long-context tasks, especially those involving information-sparse yet...
Abstractive Red-Teaming of Language Model Character
arXiv:2602.12318v1 Announce Type: new Abstract: We want language model assistants to conform to a character specification, which asserts how the model should act across diverse user interactions. While models typically follow these character specifications, they can occasionally violate them in...
The Appeal and Reality of Recycling LoRAs with Adaptive Merging
arXiv:2602.12323v1 Announce Type: new Abstract: The widespread availability of fine-tuned LoRA modules for open pre-trained models has led to an interest in methods that can adaptively merge LoRAs to improve performance. These methods typically include some way of selecting LoRAs...
Deep Doubly Debiased Longitudinal Effect Estimation with ICE G-Computation
arXiv:2602.12379v1 Announce Type: new Abstract: Estimating longitudinal treatment effects is essential for sequential decision-making but is challenging due to treatment-confounder feedback. While Iterative Conditional Expectation (ICE) G-computation offers a principled approach, its recursive structure suffers from error propagation, corrupting the...
Stabilizing Native Low-Rank LLM Pretraining
arXiv:2602.12429v1 Announce Type: new Abstract: Foundation models have achieved remarkable success, yet their growing parameter counts pose significant computational and memory challenges. Low-rank factorization offers a promising route to reduce training and inference costs, but the community lacks a stable...
A Theoretical Analysis of Mamba's Training Dynamics: Filtering Relevant Features for Generalization in State Space Models
arXiv:2602.12499v1 Announce Type: new Abstract: The recent empirical success of Mamba and other selective state space models (SSMs) has renewed interest in non-attention architectures for sequence modeling, yet their theoretical foundations remain underexplored. We present a first-step analysis of generalization...