SR-TTT: Surprisal-Aware Residual Test-Time Training
arXiv:2603.06642v1 Announce Type: new Abstract: Test-Time Training (TTT) language models achieve theoretically infinite context windows with an O(1) memory footprint by replacing the standard exact-attention KV-cache with hidden state ``fast weights'' W_fast updated via self-supervised learning during inference. However, pure...
Trust Aware Federated Learning for Secure Bone Healing Stage Interpretation in e-Health
arXiv:2603.06646v1 Announce Type: new Abstract: This paper presents a trust aware federated learning (FL) framework for interpreting bone healing stages using spectral features derived from frequency response data. The primary objective is to address the challenge posed by either unreliable...
One step further with Monte-Carlo sampler to guide diffusion better
arXiv:2603.06685v1 Announce Type: new Abstract: Stochastic differential equation (SDE)-based generative models have achieved substantial progress in conditional generation via training-free differentiable loss-guided approaches. However, existing methodologies utilizing posterior sam- pling typically confront a substantial estimation error, which results in inaccu-...
Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces
arXiv:2603.06713v1 Announce Type: new Abstract: Agentic systems operating over large tool ecosystems must plan and execute long-horizon workflows under weak or non-verifiable supervision. While frontier models mitigate these challenges through scale and large context budgets, small language models (SLMs) remain...
ProtAlign: Contrastive learning paradigm for Sequence and structure alignment
arXiv:2603.06722v1 Announce Type: new Abstract: Protein language models often take into consideration the alignment between a protein sequence and its textual description. However, they do not take structural information into consideration. Traditional methods treat sequence and structure separately, limiting the...
Bi Directional Feedback Fusion for Activity Aware Forecasting of Indoor CO2 and PM2.5
arXiv:2603.06724v1 Announce Type: new Abstract: Indoor air quality (IAQ) forecasting plays a critical role in safeguarding occupant health, ensuring thermal comfort, and supporting intelligent building control. However, predicting future concentrations of key pollutants such as carbon dioxide (CO2) and fine...
Regression Models Meet Foundation Models: A Hybrid-AI Approach to Practical Electricity Price Forecasting
arXiv:2603.06726v1 Announce Type: new Abstract: Electricity market prices exhibit extreme volatility, nonlinearity, and non-stationarity, making accurate forecasting a significant challenge. While cutting-edge time series foundation models (TSFMs) effectively capture temporal dependencies, they typically underutilize cross-variate correlations and non-periodic patterns that...
Safe Transformer: An Explicit Safety Bit For Interpretable And Controllable Alignment
arXiv:2603.06727v1 Announce Type: new Abstract: Current safety alignment methods encode safe behavior implicitly within model parameters, creating a fundamental opacity: we cannot easily inspect why a model refuses a request, nor intervene when its safety judgments fail. We propose Safe...
Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds
arXiv:2603.06729v1 Announce Type: new Abstract: Navigating safely through dense crowds requires collision avoidance that generalizes beyond the densities seen during training. Learning-based crowd navigation can break under out-of-distribution crowd sizes due to density-sensitive observation normalization and social-cost scaling, while analytical...
Heterogeneous Decentralized Diffusion Models
arXiv:2603.06741v1 Announce Type: new Abstract: Training frontier-scale diffusion models often requires substantial computational resources concentrated in tightly coupled clusters, limiting participation to well-resourced institutions. While Decentralized Diffusion Models (DDM) enable training multiple experts in isolation, existing approaches require 1176 GPU-days...
Latent Autoencoder Ensemble Kalman Filter for Data assimilation
arXiv:2603.06752v1 Announce Type: new Abstract: The ensemble Kalman filter (EnKF) is widely used for data assimilation in high-dimensional systems, but its performance often deteriorates for strongly nonlinear dynamics due to the structural mismatch between the Kalman update and the underlying...
Governor DeSantis Directs Florida State Agencies to Partner with Future of Life Institute to Shield Families from AI Harm
The collaboration will produce a Crisis Counselor Training Curriculum and a statewide AI Harms Reporting Form targeting dangerous AI companion applications
OpenAI and Google employees rush to Anthropic’s defense in DOD lawsuit
More than 30 OpenAI and Google DeepMind employees signed onto a statement supporting Anthropic's lawsuit against the Defense Department after the agency labeled the AI firm a supply-chain risk, according to court filings.
Anthropic launches code review tool to check flood of AI-generated code
Anthropic launched Code Review in Claude Code, a multi-agent system that automatically analyzes AI-generated code, flags logic errors, and helps enterprise developers manage the growing volume of code produced with AI.
OpenAI acquires Promptfoo to secure its AI agents
This deal underscores how frontier labs are scrambling to prove their technology can be used safely in critical business operations.
Anthropic sues Defense Department over supply-chain risk designation
Anthropic filed suit against the Department of Defense on Monday after the agency labeled it a supply-chain risk. The complaint calls the DOD's actions "unprecedented and unlawful."
Sandberg, Clegg join Nscale board as this ‘Stargate Norway’ startup hits $14.6B valuation
Nvidia-backed British AI infrastructure startup Nscale has raised another megaround of $2 billion.
EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair
arXiv:2603.05553v1 Announce Type: cross Abstract: Function-calling agents -- large language models that invoke tools and APIs -- require high-quality, domain-specific training data spanning executable environments, backing databases, and diverse multi-turn trajectories. We introduce EigenData, an integrated, self-evolving platform that automates...
SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement
arXiv:2603.06333v1 Announce Type: new Abstract: Recursive self-improvement is moving from theory to practice: modern systems can critique, revise, and evaluate their own outputs, yet iterative self-modification risks subtle alignment drift. We introduce SAHOO, a practical framework to monitor and control...
Boosting deep Reinforcement Learning using pretraining with Logical Options
arXiv:2603.06565v1 Announce Type: new Abstract: Deep reinforcement learning agents are often misaligned, as they over-exploit early reward signals. Recently, several symbolic approaches have addressed these challenges by encoding sparse objectives along with aligned plans. However, purely symbolic architectures are complex...
PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions
arXiv:2603.05574v1 Announce Type: cross Abstract: This paper presents PRISM: an instruction-conditioned refinement method for imitation policies in robotic manipulation. This approach bridges Imitation Learning (IL) and Reinforcement Learning (RL) frameworks into a seamless pipeline, such that an imitation policy on...
Spatiotemporal Heterogeneity of AI-Driven Traffic Flow Patterns and Land Use Interaction: A GeoAI-Based Analysis of Multimodal Urban Mobility
arXiv:2603.05581v1 Announce Type: cross Abstract: Urban traffic flow is governed by the complex, nonlinear interaction between land use configuration and spatiotemporally heterogeneous mobility demand. Conventional global regression and time-series models cannot simultaneously capture these multi-scale dynamics across multiple travel modes....
Exploring Human-in-the-Loop Themes in AI Application Development: An Empirical Thematic Analysis
arXiv:2603.05510v1 Announce Type: cross Abstract: Developing and deploying AI applications in organizations is challenging when human decision authority and oversight are underspecified across the system lifecycle. Although Human-in-the-Loop (HITL) and Human-Centered AI (HCAI) principles are widely acknowledged, operational guidance for...
Conversational Demand Response: Bidirectional Aggregator-Prosumer Coordination through Agentic AI
arXiv:2603.06217v1 Announce Type: new Abstract: Residential demand response depends on sustained prosumer participation, yet existing coordination is either fully automated, or limited to one-way dispatch signals and price alerts that offer little possibility for informed decision-making. This paper introduces Conversational...
On the Reliability of AI Methods in Drug Discovery: Evaluation of Boltz-2 for Structure and Binding Affinity Prediction
arXiv:2603.05532v1 Announce Type: cross Abstract: Despite continuing hype about the role of AI in drug discovery, no "AI-discovered drugs" have so far received regulatory approval. Here we assess one of the latest AI based tools in this domain. The ability...
RoboLayout: Differentiable 3D Scene Generation for Embodied Agents
arXiv:2603.05522v1 Announce Type: new Abstract: Recent advances in vision language models (VLMs) have shown strong potential for spatial reasoning and 3D scene layout generation from open-ended language instructions. However, generating layouts that are not only semantically coherent but also feasible...
Reasoning Models Struggle to Control their Chains of Thought
arXiv:2603.05706v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if models can control what they verbalize in their CoT, it could undermine CoT monitorability. To...
From Toil to Thought: Designing for Strategic Exploration and Responsible AI in Systematic Literature Reviews
arXiv:2603.05514v1 Announce Type: cross Abstract: Systematic Literature Reviews (SLRs) are fundamental to scientific progress, yet the process is hindered by a fragmented tool ecosystem that imposes a high cognitive load. This friction suppresses the iterative, exploratory nature of scholarly work....
Omni-C: Compressing Heterogeneous Modalities into a Single Dense Encoder
arXiv:2603.05528v1 Announce Type: cross Abstract: Recent multimodal systems often rely on separate expert modality encoders which cause linearly scaling complexity and computational overhead with added modalities. While unified Omni-models address this via Mixture-of-Expert (MoE) architectures with specialized experts and routing,...
JAWS: Enhancing Long-term Rollout of Neural Operators via Spatially-Adaptive Jacobian Regularization
arXiv:2603.05538v1 Announce Type: cross Abstract: Data-driven surrogate models improve the efficiency of simulating continuous dynamical systems, yet their autoregressive rollouts are often limited by instability and spectral blow-up. While global regularization techniques can enforce contractive dynamics, they uniformly damp high-frequency...