Learning Beyond Optimization: Stress-Gated Dynamical Regime Regulation in Autonomous Systems
arXiv:2602.18581v1 Announce Type: new Abstract: Despite their apparent diversity, modern machine learning methods can be reduced to a remarkably simple core principle: learning is achieved by continuously optimizing parameters to minimize or maximize a scalar objective function. This paradigm has...
Learning Invariant Visual Representations for Planning with Joint-Embedding Predictive World Models
arXiv:2602.18639v1 Announce Type: new Abstract: World models learned from high-dimensional visual observations allow agents to make decisions and plan directly in latent space, avoiding pixel-level reconstruction. However, recent latent predictive architectures (JEPAs), including the DINO world model (DINO-WM), display a...
Transformers for dynamical systems learn transfer operators in-context
arXiv:2602.18679v1 Announce Type: new Abstract: Large-scale foundation models for scientific machine learning adapt to physical settings unseen during training, such as zero-shot transfer between turbulent scales. This phenomenon, in-context learning, challenges conventional understanding of learning and adaptation in physical systems....
Bayesian Lottery Ticket Hypothesis
arXiv:2602.18825v1 Announce Type: new Abstract: Bayesian neural networks (BNNs) are a useful tool for uncertainty quantification, but require substantially more computational resources than conventional neural networks. For non-Bayesian networks, the Lottery Ticket Hypothesis (LTH) posits the existence of sparse subnetworks...
MultiVer: Zero-Shot Multi-Agent Vulnerability Detection
arXiv:2602.17875v1 Announce Type: cross Abstract: We present MultiVer, a zero-shot multi-agent system for vulnerability detection that achieves state-of-the-art recall without fine-tuning. A four-agent ensemble (security, correctness, performance, style) with union voting achieves 82.7% recall on PyVul, exceeding fine-tuned GPT-3.5 (81.3%)...
TFL: Targeted Bit-Flip Attack on Large Language Model
arXiv:2602.17837v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in safety and security critical applications, raising concerns about their robustness to model parameter fault injection attacks. Recent studies have shown that bit-flip attacks (BFAs), which exploit computer...
Beyond Context Sharing: A Unified Agent Communication Protocol (ACP) for Secure, Federated, and Autonomous Agent-to-Agent (A2A) Orchestration
arXiv:2602.15055v1 Announce Type: cross Abstract: The artificial intelligence space is transitioning from isolated large language models to autonomous agents capable of complex reasoning and tool use. While foundational architectures and local context management protocols have been established, the...
Exploiting Layer-Specific Vulnerabilities to Backdoor Attack in Federated Learning
arXiv:2602.15161v1 Announce Type: cross Abstract: Federated learning (FL) enables distributed model training across edge devices while preserving data locality. This decentralized approach has emerged as a promising solution for collaborative learning on sensitive user data, effectively addressing the longstanding privacy...
The Vision Wormhole: Latent-Space Communication in Heterogeneous Multi-Agent Systems
arXiv:2602.15382v1 Announce Type: new Abstract: Multi-Agent Systems (MAS) powered by Large Language Models have unlocked advanced collaborative reasoning, yet they remain shackled by the inefficiency of discrete text communication, which imposes significant runtime overhead and information quantization loss. While latent...
Multi-agent cooperation through in-context co-player inference
arXiv:2602.16301v1 Announce Type: new Abstract: Achieving cooperation among self-interested agents remains a fundamental challenge in multi-agent reinforcement learning. Recent work showed that mutual cooperation can be induced between "learning-aware" agents that account for and shape the learning dynamics of their...
The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts
arXiv:2602.15843v1 Announce Type: cross Abstract: In "Compress or Route?" (Johnson, 2026), we found that code generation tolerates aggressive prompt compression (r >= 0.6) while chain-of-thought reasoning degrades gradually. That study was limited to HumanEval (164 problems), left the "perplexity paradox"...
AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks
arXiv:2602.16901v1 Announce Type: new Abstract: LLM agents are increasingly deployed in long-horizon, complex environments to solve challenging problems, but this expansion exposes them to long-horizon attacks that exploit multi-turn user-agent-environment interactions to achieve objectives infeasible in single-turn settings. To measure...
Fundamental Limits of Black-Box Safety Evaluation: Information-Theoretic and Computational Barriers from Latent Context Conditioning
arXiv:2602.16984v1 Announce Type: new Abstract: Black-box safety evaluation of AI systems assumes model behavior on test distributions reliably predicts deployment performance. We formalize and challenge this assumption through latent context-conditioned policies -- models whose outputs depend on unobserved internal variables...
Toward Trustworthy Evaluation of Sustainability Rating Methodologies: A Human-AI Collaborative Framework for Benchmark Dataset Construction
arXiv:2602.17106v1 Announce Type: new Abstract: Sustainability or ESG rating agencies use company disclosures and external data to produce scores or ratings that assess the environmental, social, and governance performance of a company. However, sustainability ratings across agencies for a single...
From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan's Humanities and Social Sciences
arXiv:2602.17221v1 Announce Type: new Abstract: Generative AI is reshaping knowledge work, yet existing research focuses predominantly on software engineering and the natural sciences, with limited methodological exploration for the humanities and social sciences. Positioned as a "methodological experiment," this study...
Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight
arXiv:2602.17222v1 Announce Type: new Abstract: Predicting human decision-making in high-stakes environments remains a central challenge for artificial intelligence. While large language models (LLMs) demonstrate strong general reasoning, they often struggle to generate consistent, individual-specific behavior, particularly when accurate prediction depends...
One-step Language Modeling via Continuous Denoising
arXiv:2602.16813v1 Announce Type: new Abstract: Language models based on discrete diffusion have attracted widespread interest for their potential to provide faster generation than autoregressive models. In practice, however, they exhibit a sharp degradation of sample quality in the few-step regime,...
Small LLMs for Medical NLP: a Systematic Analysis of Few-Shot, Constraint Decoding, Fine-Tuning and Continual Pre-Training in Italian
arXiv:2602.17475v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently excel in diverse medical Natural Language Processing (NLP) tasks, yet their substantial computational requirements often limit deployment in real-world healthcare settings. In this work, we investigate whether "small" LLMs (around...
Bridging the Domain Divide: Supervised vs. Zero-Shot Clinical Section Segmentation from MIMIC-III to Obstetrics
arXiv:2602.17513v1 Announce Type: new Abstract: Clinical free-text notes contain vital patient information. They are structured into labelled sections; recognizing these sections has been shown to support clinical decision-making and downstream NLP tasks. In this paper, we advance clinical section segmentation...
Malliavin Calculus as Stochastic Backpropagation
arXiv:2602.17013v1 Announce Type: new Abstract: We establish a rigorous connection between pathwise (reparameterization) and score-function (Malliavin) gradient estimators by showing that both arise from the Malliavin integration-by-parts identity. Building on this equivalence, we introduce a unified and variance-aware hybrid estimator...
Transforming Behavioral Neuroscience Discovery with In-Context Learning and AI-Enhanced Tensor Methods
arXiv:2602.17027v1 Announce Type: new Abstract: Scientific discovery pipelines typically involve complex, rigid, and time-consuming processes, from data preparation to analyzing and interpreting findings. Recent advances in AI have the potential to transform such pipelines in a way that domain experts...
A breakdown of the court’s tariff decision
Empirical SCOTUS is a recurring series by Adam Feldman that looks at Supreme Court data, primarily in the form of opinions and oral arguments, to provide insights into the justices’ decision making and […]
Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis
arXiv:2602.15909v1 Announce Type: cross Abstract: Deep learning-based respiratory auscultation is currently hindered by two fundamental challenges: (i) inherent information loss, as converting signals into spectrograms discards transient acoustic events and clinical context; (ii) limited data availability, exacerbated by severe class...
VDLM: Variable Diffusion LMs via Robust Latent-to-Text Rendering
arXiv:2602.15870v1 Announce Type: new Abstract: Autoregressive language models decode left-to-right with irreversible commitments, limiting revision during multi-step reasoning. We propose VDLM, a modular variable diffusion language model that separates semantic planning from text rendering. VDLM applies LLaDA-style masked diffusion over...
CEPAE: Conditional Entropy-Penalized Autoencoders for Time Series Counterfactuals
arXiv:2602.15546v1 Announce Type: new Abstract: The ability to accurately perform counterfactual inference on time series is crucial for decision-making in fields like finance, healthcare, and marketing, as it allows us to understand the impact of events or treatments on outcomes...
Statement Regarding API Security Incident | OpenReview
Why is Normalization Preferred? A Worst-Case Complexity Theory for Stochastically Preconditioned SGD under Heavy-Tailed Noise
arXiv:2602.13413v1 Announce Type: new Abstract: We develop a worst-case complexity theory for stochastically preconditioned stochastic gradient descent (SPSGD) and its accelerated variants under heavy-tailed noise, a setting that encompasses widely used adaptive methods such as Adam, RMSProp, and Shampoo. We...
Scenario-Adaptive MU-MIMO OFDM Semantic Communication With Asymmetric Neural Network
arXiv:2602.13557v1 Announce Type: new Abstract: Semantic Communication (SemCom) has emerged as a promising paradigm for 6G networks, aiming to extract and transmit task-relevant information rather than minimizing bit errors. However, applying SemCom to realistic downlink Multi-User Multi-Input Multi-Output (MU-MIMO) Orthogonal...