Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
arXiv:2602.20517v1 Announce Type: new Abstract: Effective human-AI coordination requires artificial agents capable of exhibiting and responding to human-like behaviors while adapting to changing contexts. Imitation learning has emerged as one of the prominent approaches to build such agents by training...
Physics-based phenomenological characterization of cross-modal bias in multimodal models
arXiv:2602.20624v1 Announce Type: new Abstract: The term 'algorithmic fairness' is used to evaluate whether AI models operate fairly in both comparative (where fairness is understood as formal equality, such as "treat like cases as like") and non-comparative (where unfairness arises...
Recursive Belief Vision Language Model
arXiv:2602.20659v1 Announce Type: new Abstract: Current vision-language-action (VLA) models struggle with long-horizon manipulation under partial observability. Most existing approaches remain observation-driven, relying on short context windows or repeated queries to vision-language models (VLMs). This leads to loss of task progress,...
PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding
arXiv:2602.20696v1 Announce Type: new Abstract: Reliable AI systems require large language models (LLMs) to exhibit behaviors aligned with human preferences and values. However, most existing alignment approaches operate at training time and rely on additional high-quality data, incurring significant computational...
Pressure Reveals Character: Behavioural Alignment Evaluation at Depth
arXiv:2602.20813v1 Announce Type: new Abstract: Evaluating alignment in language models requires testing how they behave under realistic pressure, not just what they claim they would do. While alignment failures increasingly cause real-world harm, comprehensive evaluation frameworks with realistic multi-turn scenarios...
Tool Building as a Path to "Superintelligence"
arXiv:2602.21061v1 Announce Type: new Abstract: The Diligent Learner framework suggests LLMs can achieve superintelligence via test-time search, provided a sufficient step-success probability $\gamma$. In this work, we design a benchmark to measure $\gamma$ on logical out-of-distribution inference. We construct a...
Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing
arXiv:2602.20168v1 Announce Type: cross Abstract: Emergency triage decisions are made under severe information constraints, yet most data-driven deterioration models are evaluated using signals unavailable during initial assessment. We present a leakage-aware benchmarking framework for early deterioration prediction that evaluates model...
Autonomous AI and Ownership Rules
arXiv:2602.20169v1 Announce Type: cross Abstract: This Article examines the circumstances in which AI-generated outputs remain linked to their creators and the points at which they lose that connection, whether through accident, deliberate design, or emergent behavior. In cases where AI...
Disentangling Geometry, Performance, and Training in Language Models
arXiv:2602.20433v1 Announce Type: new Abstract: Geometric properties of Transformer weights, particularly the unembedding matrix, have been widely useful in language model interpretability research. Yet, their utility for estimating downstream performance remains unclear. In this work, we systematically investigate the relationship...
The ASIR Courage Model: A Phase-Dynamic Framework for Truth Transitions in Human and AI Systems
arXiv:2602.21745v1 Announce Type: new Abstract: We introduce the ASIR (Awakened Shared Intelligence Relationship) Courage Model, a phase-dynamic framework that formalizes truth-disclosure as a state transition rather than a personality trait. The model characterizes the shift from suppression (S0) to expression...
Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts
arXiv:2602.22070v1 Announce Type: new Abstract: Large language models are increasingly used in decision-making tasks that require them to process information from a variety of sources, including both human experts and other algorithmic agents. How do LLMs weigh the information provided...
EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors
arXiv:2602.21218v1 Announce Type: cross Abstract: High-quality data is essential for modern machine learning, yet many valuable corpora are sensitive and cannot be freely shared. Synthetic data offers a practical substitute for downstream development, and large language models (LLMs) have emerged...
Latent Context Compilation: Distilling Long Context into Compact Portable Memory
arXiv:2602.21221v1 Announce Type: cross Abstract: Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that...
Fintech Regulation 2026: Navigating the New Compliance Landscape
The regulatory environment for fintech has evolved dramatically, with new frameworks addressing digital assets, open banking, and AI-driven financial services.
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
arXiv:2602.21233v1 Announce Type: cross Abstract: This technical report introduces AngelSlim, a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. By consolidating cutting-edge algorithms, including quantization, speculative decoding, token pruning, and distillation, AngelSlim provides a...
Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents
arXiv:2602.22302v1 Announce Type: new Abstract: Traditional software relies on contracts -- APIs, type systems, assertions -- to specify and enforce correct behavior. AI agents, by contrast, operate on prompts and natural language instructions with no formal behavioral specification. This gap...
Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus
arXiv:2602.22408v1 Announce Type: new Abstract: Humans exhibit remarkable flexibility in abstract reasoning, and can rapidly learn and apply rules from sparse examples. To investigate the cognitive strategies underlying this ability, we introduce the Cognitive Abstraction and Reasoning Corpus (CogARC), a...
How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision?
arXiv:2602.22441v1 Announce Type: new Abstract: Latent reasoning has been recently proposed as a reasoning paradigm and performs multi-step reasoning through generating steps in the latent space instead of the textual space. This paradigm enables reasoning beyond discrete language tokens by...
Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models
arXiv:2602.22508v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) often exhibit structural fragility in complex reasoning tasks, failing to produce correct answers even after successfully deriving valid intermediate steps. Through systematic analysis, we observe that these failures frequently stem not...
AHBid: An Adaptable Hierarchical Bidding Framework for Cross-Channel Advertising
arXiv:2602.22650v1 Announce Type: new Abstract: In online advertising, the inherent complexity and dynamic nature of advertising environments necessitate the use of auto-bidding services to assist advertisers in bid optimization. This complexity is further compounded in multi-channel scenarios, where effective allocation...
AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications
arXiv:2602.22769v1 Announce Type: new Abstract: Large Language Models (LLMs) are deployed as autonomous agents in increasingly complex applications, where enabling long-horizon memory is critical for achieving strong performance. However, a significant gap exists between practical applications and current evaluation standards...
The AI Research Assistant: Promise, Peril, and a Proof of Concept
arXiv:2602.22842v1 Announce Type: new Abstract: Can artificial intelligence truly contribute to creative mathematical research, or does it merely automate routine calculations while introducing risks of error? We provide empirical evidence through a detailed case study: the discovery of novel error...
SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy
arXiv:2602.22971v1 Announce Type: new Abstract: As LLMs achieved breakthroughs in general reasoning, their proficiency in specialized scientific domains reveals pronounced gaps in existing benchmarks due to data contamination, insufficient complexity, and prohibitive human labor costs. Here we present SPM-Bench, an...
RepSPD: Enhancing SPD Manifold Representation in EEGs via Dynamic Graphs
arXiv:2602.22981v1 Announce Type: new Abstract: Decoding brain activity from electroencephalography (EEG) is crucial for neuroscience and clinical applications. Among recent advances in deep learning for EEG, geometric learning stands out as its theoretical underpinnings on symmetric positive definite (SPD) allows...
Waging the Battle for Society’s Soul: The Constitutionality of Juvenile Transfer Legislation in the Wake of Jones v. Mississippi - Minnesota Law Review
By LOGAN KNUTSON. Full Text. Trying juvenile defendants as adults is a cruel, yet enduring practice in U.S. criminal law. If convicted, these youthful offenders face brutal conditions in adult prison and a lifelong stigma. Although these devastating consequences of...
The Crisis in U.S. Cancer Care: Law, Markets, and Privatization - Minnesota Law Review
By DANIEL G. AARON. Full Text. Cancer is surging among youth and young adults in the United States, yet, instead of public regulation addressing its root causes, we have outsourced the management of cancer to the private sector. A suite...
Regulatory History and Judicial Review - Minnesota Law Review
By TODD PHILLIPS & ANTHONY MOFFA. Full Text. The Administrative Procedure Act (APA) requires federal agencies to simply "incorporate in the rules adopted a concise general statement of their basis and purpose" after they receive comments from the public, and...
ESG Investing Under Scrutiny: Legal and Regulatory Developments in 2026
ESG investing faces both increased regulatory support in some jurisdictions and political backlash in others, creating a complex compliance landscape.
The Emerging Legal Framework for Generative AI: A Comprehensive Analysis
As generative AI transforms industries worldwide, legal systems are racing to establish frameworks that balance innovation with accountability.
Digital Sovereignty: How Nations Are Asserting Control Over Technology Infrastructure
Countries worldwide are implementing digital sovereignty measures to control data flows, technology standards, and digital infrastructure within their borders.