Autonomous Vehicles and Liability: Who Is Responsible When AI Drives?
As autonomous vehicles approach widespread deployment, legal frameworks for determining liability in accidents involving self-driving cars remain uncertain.
ImpRIF: Stronger Implicit Reasoning Leads to Better Complex Instruction Following
arXiv:2602.21228v1 Announce Type: cross Abstract: As applications of large language models (LLMs) become increasingly complex, the demand for robust complex instruction following capabilities is growing accordingly. We argue that a thorough understanding of the instruction itself, especially the latent reasoning...
Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space
arXiv:2602.21269v1 Announce Type: cross Abstract: We present Group Orthogonalized Policy Optimization (GOPO), a new alignment algorithm for large language models derived from the geometry of Hilbert function spaces. Instead of optimizing on the probability simplex and inheriting the exponential curvature...
The Mean is the Mirage: Entropy-Adaptive Model Merging under Heterogeneous Domain Shifts in Medical Imaging
arXiv:2602.21372v1 Announce Type: cross Abstract: Model merging under unseen test-time distribution shifts often renders naive strategies, such as mean averaging, unreliable. This challenge is especially acute in medical imaging, where models are fine-tuned locally at clinics on private data, producing...
The Headless Firm: How AI Reshapes Enterprise Boundaries
arXiv:2602.21401v1 Announce Type: cross Abstract: The boundary of the firm is determined by coordination cost. We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew with interaction topology (O(n^2)...
Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation
arXiv:2602.22215v1 Announce Type: new Abstract: Large Language Models (LLMs) demonstrate potential in the field of scientific idea generation. However, the generated results often lack controllable academic context and traceable inspiration pathways. To bridge this gap, this paper proposes a scientific...
Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents
arXiv:2602.22302v1 Announce Type: new Abstract: Traditional software relies on contracts -- APIs, type systems, assertions -- to specify and enforce correct behavior. AI agents, by contrast, operate on prompts and natural language instructions with no formal behavioral specification. This gap...
Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace or Augment Social Scientists?
arXiv:2602.22401v1 Announce Type: new Abstract: AI agents -- systems that execute multi-step reasoning workflows with persistent state, tool access, and specialist skills -- represent a qualitative shift from prior automation technologies in social science. Unlike chatbots that respond to isolated...
Towards Autonomous Memory Agents
arXiv:2602.22406v1 Announce Type: new Abstract: Recent memory agents improve LLMs by extracting experiences and conversation history into an external storage. This enables low-overhead context assembly and online memory update without expensive LLM training. However, existing solutions remain passive and reactive;...
How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision?
arXiv:2602.22441v1 Announce Type: new Abstract: Latent reasoning has been recently proposed as a reasoning paradigm and performs multi-step reasoning through generating steps in the latent space instead of the textual space. This paradigm enables reasoning beyond discrete language tokens by...
Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models
arXiv:2602.22508v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) often exhibit structural fragility in complex reasoning tasks, failing to produce correct answers even after successfully deriving valid intermediate steps. Through systematic analysis, we observe that these failures frequently stem not...
MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios
arXiv:2602.22638v1 Announce Type: new Abstract: Route-planning agents powered by large language models (LLMs) have emerged as a promising paradigm for supporting everyday human mobility through natural language interaction and tool-mediated decision making. However, systematic evaluation in real-world mobility settings is...
RLHFless: Serverless Computing for Efficient RLHF
arXiv:2602.22718v1 Announce Type: new Abstract: Reinforcement Learning from Human Feedback (RLHF) has been widely applied to Large Language Model (LLM) post-training to align model outputs with human preferences. Recent models, such as DeepSeek-R1, have also shown RLHF's potential to improve...
Know What You Know: Metacognitive Entropy Calibration for Verifiable RL Reasoning
arXiv:2602.22751v1 Announce Type: new Abstract: Large reasoning models (LRMs) have emerged as a powerful paradigm for solving complex real-world tasks. In practice, these models are predominantly trained via Reinforcement Learning with Verifiable Rewards (RLVR), yet most existing outcome-only RLVR pipelines...
ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making
arXiv:2602.22771v1 Announce Type: new Abstract: Clinical decisions are often required under incomplete information. Clinical experts must identify whether available information is sufficient for judgment, as both premature conclusion and unnecessary abstention can compromise patient safety. To evaluate this capability of...
DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation
arXiv:2602.22839v1 Announce Type: new Abstract: Presentation generation requires deep content research, coherent visual design, and iterative refinement based on observation. However, existing presentation agents often rely on predefined workflows and fixed templates. To address this, we present DeepPresenter, an agentic...
Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search
arXiv:2602.22983v1 Announce Type: new Abstract: As Large Language Models (LLMs) are increasingly used, their security risks have drawn increasing attention. Existing research reveals that LLMs are highly susceptible to jailbreak attacks, with effectiveness varying across language contexts. This paper investigates...
Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts
arXiv:2602.22359v1 Announce Type: new Abstract: This paper tests whether large language models (LLMs) can support interpretative citation context analysis (CCA) by scaling in thick, text-grounded readings of a single hard case rather than scaling up typological labels. It foregrounds prompt-sensitivity...
Causality $\neq$ Invariance: Function and Concept Vectors in LLMs
arXiv:2602.22424v1 Announce Type: new Abstract: Do large language models (LLMs) represent concepts abstractly, i.e., independent of input format? We revisit Function Vectors (FVs), compact representations of in-context learning (ICL) tasks that causally drive task performance. Across multiple LLMs, we show...
Bridging Latent Reasoning and Target-Language Generation via Retrieval-Transition Heads
arXiv:2602.22453v1 Announce Type: new Abstract: Recent work has identified a subset of attention heads in Transformer as retrieval heads, which are responsible for retrieving information from the context. In this work, we first investigate retrieval heads in multilingual contexts. In...
Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs
arXiv:2602.22481v1 Announce Type: new Abstract: The way LLM-based entities conceive of the relationship between AI and humans is an important topic for both cultural and safety reasons. When we examine this topic, what matters is not only the model itself...
Importance of Prompt Optimisation for Error Detection in Medical Notes Using Language Models
arXiv:2602.22483v1 Announce Type: new Abstract: Errors in medical text can cause delays or even result in incorrect treatment for patients. Recently, language models have shown promise in their ability to automatically detect errors in medical text, an ability that has...
Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o
arXiv:2602.22524v1 Announce Type: new Abstract: Dyslexia affects approximately 10% of the global population and presents persistent challenges in reading fluency and text comprehension. While existing assistive technologies address visual presentation, linguistic complexity remains a substantial barrier to equitable access. This...
Towards Faithful Industrial RAG: A Reinforced Co-adaptation Framework for Advertising QA
arXiv:2602.22584v1 Announce Type: new Abstract: Industrial advertising question answering (QA) is a high-stakes task in which hallucinated content, particularly fabricated URLs, can lead to financial loss, compliance violations, and legal risk. Although Retrieval-Augmented Generation (RAG) is widely adopted, deploying it...
The Innocence Trap - Minnesota Law Review
By CAITLIN GLASS & JULIAN GREEN. What makes a conviction wrongful? Developments in DNA science have led to a wave of exonerations over the past thirty years, revealing sources of error in the criminal legal process. Innocence organizations...
Waging the Battle for Society’s Soul: The Constitutionality of Juvenile Transfer Legislation in the Wake of Jones v. Mississippi - Minnesota Law Review
By LOGAN KNUTSON. Trying juvenile defendants as adults is a cruel, yet enduring practice in U.S. criminal law. If convicted, these youthful offenders face brutal conditions in adult prison and a lifelong stigma. Although these devastating consequences of...
The Skidmore Compromise: Interpreting Skidmore as a Tiebreaker to Preserve Judicial Wisdom in the Era of Loper Bright - Minnesota Law Review
By MITCHELL ZAIC. 'Law must be stable, and yet it cannot stand still.' Here is the great antinomy confronting us at every turn. Rest and motion, unrelieved and unchecked, are equally destructive. The law, like human kind, if...
The Crisis in U.S. Cancer Care: Law, Markets, and Privatization - Minnesota Law Review
By DANIEL G. AARON. Cancer is surging among youth and young adults in the United States, yet, instead of public regulation addressing its root causes, we have outsourced the management of cancer to the private sector. A suite...
Volume 110 – Issue 3 - Minnesota Law Review
ESG Investing Under Scrutiny: Legal and Regulatory Developments in 2026
ESG investing faces both increased regulatory support in some jurisdictions and political backlash in others, creating a complex compliance landscape.