Graph Property Inference in Small Language Models: Effects of Representation and Inference Strategy
arXiv:2603.06635v1 Announce Type: new Abstract: Recent progress in language modeling has expanded the range of tasks that can be approached through natural language interfaces, including problems that require structured reasoning. However, it remains unclear how effectively limited-capacity language models can...
SmartBench: Evaluating LLMs in Smart Homes with Anomalous Device States and Behavioral Contexts
arXiv:2603.06636v1 Announce Type: new Abstract: Due to the strong context-awareness capabilities demonstrated by large language models (LLMs), recent research has begun exploring their integration into smart home assistants to help users manage and adjust their living environments. While LLMs have...
HEARTS: Benchmarking LLM Reasoning on Health Time Series
arXiv:2603.06638v1 Announce Type: new Abstract: The rise of large language models (LLMs) has shifted time series analysis from narrow analytics to general-purpose reasoning. Yet, existing benchmarks cover only a small set of health time series modalities and tasks, failing to...
SR-TTT: Surprisal-Aware Residual Test-Time Training
arXiv:2603.06642v1 Announce Type: new Abstract: Test-Time Training (TTT) language models achieve theoretically infinite context windows with an O(1) memory footprint by replacing the standard exact-attention KV-cache with hidden state "fast weights" W_fast updated via self-supervised learning during inference. However, pure...
Trust Aware Federated Learning for Secure Bone Healing Stage Interpretation in e-Health
arXiv:2603.06646v1 Announce Type: new Abstract: This paper presents a trust aware federated learning (FL) framework for interpreting bone healing stages using spectral features derived from frequency response data. The primary objective is to address the challenge posed by either unreliable...
HURRI-GAN: A Novel Approach for Hurricane Bias-Correction Beyond Gauge Stations using Generative Adversarial Networks
arXiv:2603.06649v1 Announce Type: new Abstract: The coastal regions of the eastern and southern United States are impacted by severe storm events, leading to significant loss of life and properties. Accurately forecasting storm surge and wind impacts from hurricanes is essential...
Geodesic Gradient Descent: A Generic and Learning-rate-free Optimizer on Objective Function-induced Manifolds
arXiv:2603.06651v1 Announce Type: new Abstract: Euclidean gradient descent algorithms barely capture the geometry of objective function-induced hypersurfaces and risk driving update trajectories off the hypersurfaces. Riemannian gradient descent algorithms address these issues but fail to represent complex hypersurfaces via a...
ERP-RiskBench: Leakage-Safe Ensemble Learning for Financial Risk
arXiv:2603.06671v1 Announce Type: new Abstract: Financial risk detection in Enterprise Resource Planning (ERP) systems is an important but underexplored application of machine learning. Published studies in this area tend to suffer from vague dataset descriptions, leakage-prone pipelines, and evaluation practices...
Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces
arXiv:2603.06713v1 Announce Type: new Abstract: Agentic systems operating over large tool ecosystems must plan and execute long-horizon workflows under weak or non-verifiable supervision. While frontier models mitigate these challenges through scale and large context budgets, small language models (SLMs) remain...
From Statistical Fidelity to Clinical Consistency: Scalable Generation and Auditing of Synthetic Patient Trajectories
arXiv:2603.06720v1 Announce Type: new Abstract: Access to electronic health records (EHRs) for digital health research is often limited by privacy regulations and institutional barriers. Synthetic EHRs have been proposed as a way to enable safe and sovereign data sharing; however,...
ProtAlign: Contrastive learning paradigm for Sequence and structure alignment
arXiv:2603.06722v1 Announce Type: new Abstract: Protein language models often take into consideration the alignment between a protein sequence and its textual description. However, they do not take structural information into consideration. Traditional methods treat sequence and structure separately, limiting the...
Bi Directional Feedback Fusion for Activity Aware Forecasting of Indoor CO2 and PM2.5
arXiv:2603.06724v1 Announce Type: new Abstract: Indoor air quality (IAQ) forecasting plays a critical role in safeguarding occupant health, ensuring thermal comfort, and supporting intelligent building control. However, predicting future concentrations of key pollutants such as carbon dioxide (CO2) and fine...
Regression Models Meet Foundation Models: A Hybrid-AI Approach to Practical Electricity Price Forecasting
arXiv:2603.06726v1 Announce Type: new Abstract: Electricity market prices exhibit extreme volatility, nonlinearity, and non-stationarity, making accurate forecasting a significant challenge. While cutting-edge time series foundation models (TSFMs) effectively capture temporal dependencies, they typically underutilize cross-variate correlations and non-periodic patterns that...
Safe Transformer: An Explicit Safety Bit For Interpretable And Controllable Alignment
arXiv:2603.06727v1 Announce Type: new Abstract: Current safety alignment methods encode safe behavior implicitly within model parameters, creating a fundamental opacity: we cannot easily inspect why a model refuses a request, nor intervene when its safety judgments fail. We propose Safe...
Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds
arXiv:2603.06729v1 Announce Type: new Abstract: Navigating safely through dense crowds requires collision avoidance that generalizes beyond the densities seen during training. Learning-based crowd navigation can break under out-of-distribution crowd sizes due to density-sensitive observation normalization and social-cost scaling, while analytical...
Improved Constrained Generation by Bridging Pretrained Generative Models
arXiv:2603.06742v1 Announce Type: new Abstract: Constrained generative modeling is fundamental to applications such as robotic control and autonomous driving, where models must respect physical laws and safety-critical constraints. In real-world settings, these constraints rarely take the form of simple linear...
Stabilizing Reinforcement Learning for Diffusion Language Models
arXiv:2603.06743v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is highly effective for post-training autoregressive (AR) language models, yet its direct application to diffusion large language models (dLLMs) often triggers reward collapse. We identify two sources of incompatibility. First,...
Latent Autoencoder Ensemble Kalman Filter for Data assimilation
arXiv:2603.06752v1 Announce Type: new Abstract: The ensemble Kalman filter (EnKF) is widely used for data assimilation in high-dimensional systems, but its performance often deteriorates for strongly nonlinear dynamics due to the structural mismatch between the Kalman update and the underlying...
Court agrees to hear case on environmental laws, does not act on several Second Amendment challenges
Updated on March 9 at 5:14 p.m. The Supreme Court added just one case – a technical dispute over the interaction between two federal environmental laws – to its docket […]
In birthright citizenship case, Justice Department urges court to treat an old concept in a new way
Immigration Matters is a recurring series by César Cuauhtémoc García Hernández that analyzes the court’s immigration docket, highlighting emerging legal questions about new policy and enforcement practices. President Donald Trump’s […]
The dissent that believed the Olympics belong to everyone
In Dissent is a recurring series by Anastasia Boden on Supreme Court dissents that have shaped (or reshaped) our country. The Olympics are one of those rare moments when the […]
SCOTUStoday for Monday, March 9
Just 22% of U.S. registered voters have “a great deal” (7%) or “quite a bit” (15%) of confidence in the Supreme Court, according to a new NBC News poll shared […] The post SCOTUStoday for Monday, March 9 appeared first on SCOTUSblog.
US blindsides states with surprise settlement in Live Nation/Ticketmaster trial
States seek mistrial, saying "sudden disappearance" of US will influence jury.
Nintendo sues to prevent Trump from dodging full tariff refunds
Nintendo may face pressure to share refunds with gamers who helped pay tariffs.
Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation
arXiv:2603.06064v1 Announce Type: new Abstract: Task planning, the problem of sequencing actions to reach a goal from an initial state, is a core capability requirement for autonomous robotic systems. Whether large language models (LLMs) can serve as viable planners alongside...
Spatiotemporal Heterogeneity of AI-Driven Traffic Flow Patterns and Land Use Interaction: A GeoAI-Based Analysis of Multimodal Urban Mobility
arXiv:2603.05581v1 Announce Type: cross Abstract: Urban traffic flow is governed by the complex, nonlinear interaction between land use configuration and spatiotemporally heterogeneous mobility demand. Conventional global regression and time-series models cannot simultaneously capture these multi-scale dynamics across multiple travel modes....
Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows
arXiv:2603.06394v1 Announce Type: new Abstract: Large language models (LLMs) can now translate a researcher's plain-language goal into executable computation, yet scientific workflows demand determinism, provenance, and governance that are difficult to guarantee when an LLM decides what runs. Semi-structured interviews...
Tool-Genesis: A Task-Driven Tool Creation Benchmark for Self-Evolving Language Agent
arXiv:2603.05578v1 Announce Type: cross Abstract: Research on self-evolving language agents has accelerated, drawing increasing attention to their ability to create, adapt, and maintain tools from task requirements. However, existing benchmarks predominantly rely on predefined specifications, which limits scalability and hinders...
DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces
arXiv:2603.05607v1 Announce Type: cross Abstract: Computer-Aided Design (CAD) relies on structured and editable geometric representations, yet existing generative methods are constrained by small annotated datasets with explicit design histories or boundary representation (BRep) labels. Meanwhile, millions of unannotated 3D meshes...
Offline Materials Optimization with CliqueFlowmer
arXiv:2603.06082v1 Announce Type: new Abstract: Recent advances in deep learning inspired neural network-based approaches to computational materials discovery (CMD). A plethora of problems in this field involve finding materials that optimize a target property. Nevertheless, the increasingly popular generative modeling...