Knob: A Physics-Inspired Gating Interface for Interpretable and Controllable Neural Dynamics
arXiv:2602.22702v1 Announce Type: new Abstract: Existing neural network calibration methods often treat calibration as a static, post-hoc optimization task. However, this neglects the dynamic and temporal nature of real-world inference. Moreover, existing methods do not provide an intuitive interface enabling...
RLHFless: Serverless Computing for Efficient RLHF
arXiv:2602.22718v1 Announce Type: new Abstract: Reinforcement Learning from Human Feedback (RLHF) has been widely applied to Large Language Model (LLM) post-training to align model outputs with human preferences. Recent models, such as DeepSeek-R1, have also shown RLHF's potential to improve...
Know What You Know: Metacognitive Entropy Calibration for Verifiable RL Reasoning
arXiv:2602.22751v1 Announce Type: new Abstract: Large reasoning models (LRMs) have emerged as a powerful paradigm for solving complex real-world tasks. In practice, these models are predominantly trained via Reinforcement Learning with Verifiable Rewards (RLVR), yet most existing outcome-only RLVR pipelines...
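The entropy signal at the heart of this kind of calibration can be sketched in a few lines. This is a minimal illustration of measuring a model's per-step uncertainty, not the paper's method; the distributions and function names are hypothetical.

```python
import math

def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mean_entropy(step_distributions):
    """Average per-step entropy over a sampled reasoning trace.
    Low values suggest the model 'knows what it knows'; high values
    flag steps where verifiable-reward training may be unreliable."""
    return sum(token_entropy(d) for d in step_distributions) / len(step_distributions)

# Hypothetical distributions: one confident step, one maximally uncertain step.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]
print(round(mean_entropy([confident, uncertain]), 3))  # -> 0.777
```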
AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications
arXiv:2602.22769v1 Announce Type: new Abstract: Large Language Models (LLMs) are deployed as autonomous agents in increasingly complex applications, where enabling long-horizon memory is critical for achieving strong performance. However, a significant gap exists between practical applications and current evaluation standards...
DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation
arXiv:2602.22839v1 Announce Type: new Abstract: Presentation generation requires deep content research, coherent visual design, and iterative refinement based on observation. However, existing presentation agents often rely on predefined workflows and fixed templates. To address this, we present DeepPresenter, an agentic...
The AI Research Assistant: Promise, Peril, and a Proof of Concept
arXiv:2602.22842v1 Announce Type: new Abstract: Can artificial intelligence truly contribute to creative mathematical research, or does it merely automate routine calculations while introducing risks of error? We provide empirical evidence through a detailed case study: the discovery of novel error...
OmniGAIA: Towards Native Omni-Modal AI Agents
arXiv:2602.22897v1 Announce Type: new Abstract: Human intelligence naturally intertwines omni-modal perception -- spanning vision, audio, and language -- with complex reasoning and tool usage to interact with the world. However, current multi-modal LLMs are primarily confined to bi-modal interactions (e.g.,...
Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots
arXiv:2602.22973v1 Announce Type: new Abstract: Human-in-the-loop validation is essential in safety-critical clinical AI, yet the transition between initial model inference and expert correction is rarely analyzed as a structured signal. We introduce a diagnostic alignment framework in which the AI-generated...
Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design
arXiv:2602.23092v1 Announce Type: new Abstract: The Capacitated Vehicle Routing Problem (CVRP), a fundamental combinatorial optimization challenge, focuses on optimizing fleet operations under vehicle capacity constraints. While extensively studied in operational research, the NP-hard nature of CVRP continues to pose significant...
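Automatic heuristic design pipelines like this typically start from a simple constructive baseline that the LLM then mutates. A sketch of such a baseline, assuming a nearest-neighbor rule and that every single demand fits in one vehicle (names and data are illustrative, not from the paper):

```python
def greedy_cvrp(depot, customers, demand, capacity):
    """Nearest-neighbor construction for CVRP: extend the current route
    with the closest feasible customer, and open a new route whenever
    the next pickup would exceed vehicle capacity.
    `customers` maps id -> (x, y); `demand` maps id -> load."""
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    unvisited = set(customers)
    routes = []
    while unvisited:
        route, load, pos = [], 0, depot
        while True:
            feasible = [c for c in unvisited if load + demand[c] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda c: dist(pos, customers[c]))
            route.append(nxt)
            load += demand[nxt]
            pos = customers[nxt]
            unvisited.remove(nxt)
        routes.append(route)
    return routes

customers = {1: (0, 1), 2: (0, 2), 3: (5, 0)}
demand = {1: 4, 2: 4, 3: 4}
print(greedy_cvrp((0, 0), customers, demand, capacity=8))  # -> [[1, 2], [3]]
```

An LLM-driven design loop would rewrite the selection rule (the `min(...)` line) and keep whichever variant scores best on benchmark instances.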
Decoder-based Sense Knowledge Distillation
arXiv:2602.22351v1 Announce Type: new Abstract: Large language models (LLMs) learn contextual embeddings that capture rich semantic information, yet they often overlook structured lexical knowledge such as word senses and relationships. Prior work has shown that incorporating sense dictionaries can improve...
Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts
arXiv:2602.22359v1 Announce Type: new Abstract: This paper tests whether large language models (LLMs) can support interpretative citation context analysis (CCA) by scaling in thick, text-grounded readings of a single hard case rather than scaling up typological labels. It foregrounds prompt-sensitivity...
Causality $\neq$ Invariance: Function and Concept Vectors in LLMs
arXiv:2602.22424v1 Announce Type: new Abstract: Do large language models (LLMs) represent concepts abstractly, i.e., independent of input format? We revisit Function Vectors (FVs), compact representations of in-context learning (ICL) tasks that causally drive task performance. Across multiple LLMs, we show...
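The Function Vector recipe being revisited here is easy to state: average a layer's hidden state over many ICL prompts for a task, then add that mean into a zero-shot forward pass. A toy sketch with plain lists standing in for residual-stream activations (all values hypothetical):

```python
def extract_function_vector(icl_activations):
    """Average a chosen layer's hidden state across many ICL prompts
    for the same task; the mean acts as a compact task vector."""
    n, d = len(icl_activations), len(icl_activations[0])
    return [sum(v[i] for v in icl_activations) / n for i in range(d)]

def apply_function_vector(hidden_state, fv, alpha=1.0):
    """Causal intervention: add the function vector into the residual
    stream of a zero-shot run to steer it toward the ICL task."""
    return [h + alpha * f for h, f in zip(hidden_state, fv)]

acts = [[1.0, 0.0], [3.0, 2.0]]       # two hypothetical ICL runs, d = 2
fv = extract_function_vector(acts)    # [2.0, 1.0]
print(apply_function_vector([0.5, 0.5], fv))  # -> [2.5, 1.5]
```

The paper's question is whether the extracted `fv` stays invariant when the input format changes, which the causal effect alone does not guarantee.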
A Fusion of context-aware based BanglaBERT and Two-Layer Stacked LSTM Framework for Multi-Label Cyberbullying Detection
arXiv:2602.22449v1 Announce Type: new Abstract: Cyberbullying has become a serious and growing concern in today's virtual world. When left unnoticed, it can have adverse consequences for social and mental health. Researchers have explored various types of cyberbullying, but most approaches...

Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads
arXiv:2602.22453v1 Announce Type: new Abstract: Recent work has identified a subset of attention heads in Transformers as retrieval heads, which are responsible for retrieving information from the context. In this work, we first investigate retrieval heads in multilingual contexts. In...
Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models
arXiv:2602.22475v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed in culturally sensitive real-world tasks. However, existing cultural alignment approaches fail to align LLMs' broad cultural values with the specific goals of downstream tasks and suffer from cross-culture...
Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o
arXiv:2602.22524v1 Announce Type: new Abstract: Dyslexia affects approximately 10% of the global population and presents persistent challenges in reading fluency and text comprehension. While existing assistive technologies address visual presentation, linguistic complexity remains a substantial barrier to equitable access. This...
Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training
arXiv:2602.22576v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by incorporating external knowledge, yet traditional single-round retrieval struggles with complex multi-step reasoning. Agentic RAG addresses this by enabling LLMs to dynamically decide when and what to...
Enhancing Persuasive Dialogue Agents by Synthesizing Cross-Disciplinary Communication Strategies
arXiv:2602.22696v1 Announce Type: new Abstract: Current approaches to developing persuasive dialogue agents often rely on a limited set of predefined persuasive strategies that fail to capture the complexity of real-world interactions. We applied a cross-disciplinary approach to develop a framework...
The Poly Problem in Zoning: Redefining “Family” for a Changing Society (Minnesota Law Review)
By ARIC SHORT & TANYA PIERCE. Full Text. Single-family zoning has long dictated not only where people may live but also with whom. Although extensively critiqued for perpetuating racial and economic exclusion, these laws also privilege relationships defined by blood,...
The Innocence Trap (Minnesota Law Review)
By CAITLIN GLASS & JULIAN GREEN. Full Text. What makes a conviction wrongful? Developments in DNA science have led to a wave of exonerations over the past thirty years, revealing sources of error in the criminal legal process. Innocence organizations...
The Skidmore Compromise: Interpreting Skidmore as a Tiebreaker to Preserve Judicial Wisdom in the Era of Loper Bright (Minnesota Law Review)
By MITCHELL ZAIC. Full Text. 'Law must be stable, and yet it cannot stand still.' Here is the great antinomy confronting us at every turn. Rest and motion, unrelieved and unchecked, are equally destructive. The law, like human kind, if...
The Crisis in U.S. Cancer Care: Law, Markets, and Privatization (Minnesota Law Review)
By DANIEL G. AARON. Full Text. Cancer is surging among youth and young adults in the United States, yet, instead of public regulation addressing its root causes, we have outsourced the management of cancer to the private sector. A suite...
The Rise of AI-Powered Legal Research: Transforming How Lawyers Work
AI-powered legal research tools are fundamentally changing the practice of law, offering unprecedented efficiency while raising questions about quality and oversight.
The Emerging Legal Framework for Generative AI: A Comprehensive Analysis
As generative AI transforms industries worldwide, legal systems are racing to establish frameworks that balance innovation with accountability.
Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue
arXiv:2602.22697v1 Announce Type: new Abstract: The rapid evolution of Large Language Models (LLMs) has accelerated the transition from conversational chatbots to general agents. However, effectively balancing empathetic communication with budget-aware decision-making remains an open challenge. Since existing methods fail to...
AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
arXiv:2602.22755v1 Announce Type: new Abstract: We introduce AuditBench, an alignment auditing benchmark. AuditBench consists of 56 language models with implanted hidden behaviors. Each model has one of 14 concerning behaviors--such as sycophantic deference, opposition to AI regulation, or secret geopolitical...
Towards Better RL Training Data Utilization via Second-Order Rollout
arXiv:2602.22765v1 Announce Type: new Abstract: Reinforcement Learning (RL) has empowered Large Language Models (LLMs) with strong reasoning capabilities, but vanilla RL mainly focuses on generation capability improvement by training with only first-order rollout (generating multiple responses for a question), and...
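The "first-order rollout" this abstract contrasts against is the standard setup: sample a group of responses per question and score each one against the group. A common way such groups are scored is a group-relative baseline, sketched below; this illustrates the vanilla setup, not the paper's second-order extension.

```python
def group_relative_advantages(rewards):
    """Score a first-order rollout group: each of the G sampled responses
    gets (reward - group mean) / group std as its advantage, a common
    baseline in outcome-reward RL for LLMs."""
    g = len(rewards)
    mean = sum(rewards) / g
    var = sum((r - mean) ** 2 for r in rewards) / g
    std = var ** 0.5 or 1.0  # avoid divide-by-zero when all rewards tie
    return [(r - mean) / std for r in rewards]

# Four rollouts for one question: two correct (reward 1), two wrong (reward 0).
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))  # -> [1.0, -1.0, -1.0, 1.0]
```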
Probing for Knowledge Attribution in Large Language Models
arXiv:2602.22787v1 Announce Type: new Abstract: Large language models (LLMs) often generate fluent but unfounded claims, or hallucinations, which fall into two types: (i) faithfulness violations - misusing user context - and (ii) factuality violations - errors from internal knowledge. Proper...
Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching
arXiv:2602.22871v1 Announce Type: new Abstract: Reasoning with large language models often benefits from generating multiple chains-of-thought, but existing aggregation strategies are typically trajectory-level (e.g., selecting the best trace or voting on the final answer), discarding useful intermediate work from partial...
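The trajectory-level baseline this abstract argues against can be stated in a few lines: vote on each chain-of-thought's final answer and discard everything else. A minimal sketch (input answers are hypothetical):

```python
from collections import Counter

def majority_vote(final_answers):
    """Trajectory-level aggregation: vote on each chain-of-thought's
    final answer, discarding all intermediate reasoning steps."""
    answer, _count = Counter(final_answers).most_common(1)[0]
    return answer

print(majority_vote(["42", "41", "42", "42", "17"]))  # -> 42
```

Reward-guided stitching, by contrast, would splice together promising partial steps from different traces rather than throwing away every non-winning trajectory.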
Where Vision Becomes Text: Locating the OCR Routing Bottleneck in Vision-Language Models
arXiv:2602.22918v1 Announce Type: new Abstract: Vision-language models (VLMs) can read text from images, but where does this optical character recognition (OCR) information enter the language processing stream? We investigate the OCR routing mechanism across three architecture families (Qwen3-VL, Phi-4, InternVL3.5)...