Multi-Agent Pathfinding with Non-Unit Integer Edge Costs via Enhanced Conflict-Based Search and Graph Discretization
arXiv:2604.05416v1 Announce Type: new Abstract: Multi-Agent Pathfinding (MAPF) plays a critical role in various domains. Traditional MAPF methods typically assume unit edge costs and single-timestep actions, which limit their applicability to real-world scenarios. MAPFR extends MAPF to handle non-unit costs...
Learning-Based Multi-Criteria Decision Making Model for Sawmill Location Problems
arXiv:2604.04996v1 Announce Type: new Abstract: Strategically locating a sawmill is vital for enhancing the efficiency, profitability, and sustainability of timber supply chains. Our study proposes a Learning-Based Multi-Criteria Decision-Making (LB-MCDM) framework that integrates machine learning (ML) with GIS-based spatial location...
Uber is the latest to be won over by Amazon’s AI chips
Uber is expanding its AWS contract to run more of its ride-sharing features on Amazon's chips. This is a thumb-of-the nose at Oracle and Google.
The AI gold rush is pulling private wealth into riskier, earlier bets
On a recent episode of Equity, we talked to Arena Private Wealth to explore a growing trend: family offices bypassing VCs to gain direct exposure to AI startups, turning them from passive investors into active participants.
TableVision: A Large-Scale Benchmark for Spatially Grounded Reasoning over Complex Hierarchical Tables
arXiv:2604.03660v1 Announce Type: new Abstract: Structured tables are essential for conveying high-density information in professional domains such as finance, healthcare, and scientific research. Despite the progress in Multimodal Large Language Models (MLLMs), reasoning performance remains limited for complex tables with...
Shorter, but Still Trustworthy? An Empirical Study of Chain-of-Thought Compression
arXiv:2604.04120v1 Announce Type: new Abstract: Long chain-of-thought (Long-CoT) reasoning models have motivated a growing body of work on compressing reasoning traces to reduce inference cost, yet existing evaluations focus almost exclusively on task accuracy and token savings. Trustworthiness properties, whether...
Scalable Variational Bayesian Fine-Tuning of LLMs via Orthogonalized Low-Rank Adapters
arXiv:2604.03388v1 Announce Type: new Abstract: When deploying large language models (LLMs) to safety-critical applications, uncertainty quantification (UQ) is of utmost importance to self-assess the reliability of the LLM-based decisions. However, such decisions typically suffer from overconfidence, particularly after parameter-efficient fine-tuning...
Quantifying Trust: Financial Risk Management for Trustworthy AI Agents
arXiv:2604.03976v1 Announce Type: new Abstract: Prior work on trustworthy AI emphasizes model-internal properties such as bias mitigation, adversarial robustness, and interpretability. As AI systems evolve into autonomous agents deployed in open environments and increasingly connected to payments or assets, the...
Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards
arXiv:2604.03891v1 Announce Type: new Abstract: Multi-task representation learning (MTRL) is an approach that learns shared latent representations across related tasks, facilitating collaborative learning that improves the overall learning efficiency. This paper studies MTRL for multi-task reinforcement learning (RL), where multiple...
A Model of Understanding in Deep Learning Systems
arXiv:2604.04171v1 Announce Type: new Abstract: I propose a model of systematic understanding, suitable for machine learning systems. On this account, an agent understands a property of a target system when it contains an adequate internal model that tracks real regularities,...
I-CALM: Incentivizing Confidence-Aware Abstention for LLM Hallucination Mitigation
arXiv:2604.03904v1 Announce Type: new Abstract: Large language models (LLMs) frequently produce confident but incorrect answers, partly because common binary scoring conventions reward answering over honestly expressing uncertainty. We study whether prompt-only interventions -- explicitly announcing reward schemes for answer-versus-abstain decisions...
Toward Full Autonomous Laboratory Instrumentation Control with Large Language Models
arXiv:2604.03286v1 Announce Type: new Abstract: The control of complex laboratory instrumentation often requires significant programming expertise, creating a barrier for researchers lacking computational skills. This work explores the potential of large language models (LLMs), such as ChatGPT, and LLM-based artificial...
Vocabulary Dropout for Curriculum Diversity in LLM Co-Evolution
arXiv:2604.03472v1 Announce Type: new Abstract: Co-evolutionary self-play, where one language model generates problems and another solves them, promises autonomous curriculum learning without human supervision. In practice, the proposer quickly converges to a narrow distribution of problems that satisfy the reward...
Adaptive Threshold-Driven Continuous Greedy Method for Scalable Submodular Optimization
arXiv:2604.03419v1 Announce Type: new Abstract: Submodular maximization under matroid constraints is a fundamental problem in combinatorial optimization with applications in sensing, data summarization, active learning, and resource allocation. While the Sequential Greedy (SG) algorithm achieves only a $\frac{1}{2}$-approximation due to...
BioAlchemy: Distilling Biological Literature into Reasoning-Ready Reinforcement Learning Training Data
arXiv:2604.03506v1 Announce Type: new Abstract: Despite the large corpus of biology training text, the impact of reasoning models on biological research generally lags behind math and coding. In this work, we show that biology questions from current large-scale reasoning datasets...
SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression
arXiv:2604.03258v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated impressive capabilities across various tasks, but the billion-scale parameters pose deployment challenges. Although existing methods attempt to reduce the scale of LLMs, they require either special hardware support or...
Unveiling Language Routing Isolation in Multilingual MoE Models for Interpretable Subnetwork Adaptation
arXiv:2604.03592v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models exhibit striking performance disparities across languages, yet the internal mechanisms driving these gaps remain poorly understood. In this work, we conduct a systematic analysis of expert routing patterns in MoE models, revealing...
When Models Know More Than They Say: Probing Analogical Reasoning in LLMs
arXiv:2604.03877v1 Announce Type: new Abstract: Analogical reasoning is a core cognitive faculty essential for narrative understanding. While LLMs perform well when surface and structural cues align, they struggle in cases where an analogy is not apparent on the surface but...
Improving Model Performance by Adapting the KGE Metric to Account for System Non-Stationarity
arXiv:2604.03906v1 Announce Type: new Abstract: Geoscientific systems tend to be characterized by pronounced temporal non-stationarity, arising from seasonal and climatic variability in hydrometeorological drivers, and from natural and anthropogenic changes to land use and cover. As has been pointed out,...
CAWN: Continuous Acoustic Wave Networks for Autoregressive Language Modeling
arXiv:2604.04250v1 Announce Type: new Abstract: Modern Large Language Models (LLMs) rely on Transformer self-attention, which scales quadratically with sequence length. Recent linear-time alternatives, like State Space Models (SSMs), often suffer from signal degradation over extended contexts. We introduce the Continuous...
POEMetric: The Last Stanza of Humanity
arXiv:2604.03695v1 Announce Type: new Abstract: Large Language Models (LLMs) can compose poetry, but how far are they from human poets? In this paper, we introduce POEMetric, the first comprehensive framework for poetry evaluation, examining 1) basic instruction-following abilities in generating...
CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection
arXiv:2604.04174v1 Announce Type: new Abstract: The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled...
CresOWLve: Benchmarking Creative Problem-Solving Over Real-World Knowledge
arXiv:2604.03374v1 Announce Type: new Abstract: Creative problem-solving requires combining multiple cognitive abilities, including logical reasoning, lateral thinking, analogy-making, and commonsense knowledge, to discover insights that connect seemingly unrelated pieces of information. However, most existing benchmarks for large language models (LLMs)...
ActionNex: A Virtual Outage Manager for Cloud
arXiv:2604.03512v1 Announce Type: new Abstract: Outage management in large-scale cloud operations remains heavily manual, requiring rapid triage, cross-team coordination, and experience-driven decisions under partial observability. We present \textbf{ActionNex}, a production-grade agentic system that supports end-to-end outage assistance, including real-time updates,...
CODE-GEN: A Human-in-the-Loop RAG-Based Agentic AI System for Multiple-Choice Question Generation
arXiv:2604.03926v1 Announce Type: new Abstract: We present CODE-GEN, a human-in-the-Loop, retrieval-augmented generation (RAG)-based agentic AI system for generating context-aligned multiple-choice questions to develop student code reasoning and comprehension abilities. CODE-GEN employs an agentic AI architecture in which a Generator agent...
Unmasking Hallucinations: A Causal Graph-Attention Perspective on Factual Reliability in Large Language Models
arXiv:2604.04020v1 Announce Type: new Abstract: This paper primarily focuses on the hallucinations caused due to AI language models(LLMs).LLMs have shown extraordinary Language understanding and generation capabilities .Still it has major a disadvantage hallucinations which give outputs which are factually incorrect...
Adversarial Robustness of Deep State Space Models for Forecasting
arXiv:2604.03427v1 Announce Type: new Abstract: State-space model (SSM) for time-series forecasting have demonstrated strong empirical performance on benchmark datasets, yet their robustness under adversarial perturbations is poorly understood. We address this gap through a control-theoretic lens, focusing on the recently...
Rashomon Memory: Towards Argumentation-Driven Retrieval for Multi-Perspective Agent Memory
arXiv:2604.03588v1 Announce Type: new Abstract: AI agents operating over extended time horizons accumulate experiences that serve multiple concurrent goals, and must often maintain conflicting interpretations of the same events. A concession during a client negotiation encodes as a ``trust-building investment''...
Align Your Structures: Generating Trajectories with Structure Pretraining for Molecular Dynamics
arXiv:2604.03911v1 Announce Type: new Abstract: Generating molecular dynamics (MD) trajectories using deep generative models has attracted increasing attention, yet remains inherently challenging due to the limited availability of MD data and the complexities involved in modeling high-dimensional MD distributions. To...
MultiPress: A Multi-Agent Framework for Interpretable Multimodal News Classification
arXiv:2604.03586v1 Announce Type: new Abstract: With the growing prevalence of multimodal news content, effective news topic classification demands models capable of jointly understanding and reasoning over heterogeneous data such as text and images. Existing methods often process modalities independently or...