Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
arXiv:2604.05834v1 Announce Type: new Abstract: Multimodal contrastive learning is increasingly enriched by going beyond image-text pairs. Among recent contrastive methods, Symile is a strong approach for this challenge because its multiplicative interaction objective captures higher-order cross-modal dependence. Yet, we find...
Not All Turns Are Equally Hard: Adaptive Thinking Budgets For Efficient Multi-Turn Reasoning
arXiv:2604.05164v1 Announce Type: new Abstract: As LLM reasoning performance plateau, improving inference-time compute efficiency is crucial to mitigate overthinking and long thinking traces even for simple queries. Prior approaches including length regularization, adaptive routing, and difficulty-based budget allocation primarily focus...
ICR-Drive: Instruction Counterfactual Robustness for End-to-End Language-Driven Autonomous Driving
arXiv:2604.05378v1 Announce Type: new Abstract: Recent progress in vision-language-action (VLA) models has enabled language-conditioned driving agents to execute natural-language navigation commands in closed-loop simulation, yet standard evaluations largely assume instructions are precise and well-formed. In deployment, instructions vary in phrasing...
TRACE: Capability-Targeted Agentic Training
arXiv:2604.05336v1 Announce Type: new Abstract: Large Language Models (LLMs) deployed in agentic environments must exercise multiple capabilities across different task instances, where a capability is performing one or more actions in a trajectory that are necessary for successfully solving a...
From Retinal Evidence to Safe Decisions: RETINA-SAFE and ECRT for Hallucination Risk Triage in Medical LLMs
arXiv:2604.05348v1 Announce Type: new Abstract: Hallucinations in medical large language models (LLMs) remain a safety-critical issue, particularly when available evidence is insufficient or conflicting. We study this problem in diabetic retinopathy (DR) decision settings and introduce RETINA-SAFE, an evidence-grounded benchmark...
UniCreative: Unifying Long-form Logic and Short-form Sparkle via Reference-Free Reinforcement Learning
arXiv:2604.05517v1 Announce Type: new Abstract: A fundamental challenge in creative writing lies in reconciling the inherent tension between maintaining global coherence in long-form narratives and preserving local expressiveness in short-form texts. While long-context generation necessitates explicit macroscopic planning, short-form creativity...
Simulating the Evolution of Alignment and Values in Machine Intelligence
arXiv:2604.05274v1 Announce Type: new Abstract: Model alignment is currently applied in a vacuum, evaluated primarily through standardised benchmark performance. The purpose of this study is to examine the effects of alignment on populations of models through time. We focus on...
LLMs Should Express Uncertainty Explicitly
arXiv:2604.05306v1 Announce Type: new Abstract: Large language models are increasingly used in settings where uncertainty must drive decisions such as abstention, retrieval, and verification. Most existing methods treat uncertainty as a latent quantity to estimate after generation rather than a...
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
arXiv:2604.05091v1 Announce Type: new Abstract: We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large language models at full precision on a single GPU. Unlike traditional GPU-centric systems, MegaTrain stores parameters and optimizer states in host memory (CPU...
Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification
arXiv:2604.05302v1 Announce Type: new Abstract: Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized parallel corpora is costly, while existing large language model (LLM)-based readability control methods rely on pre-labeled...
ActivityEditor: Learning to Synthesize Physically Valid Human Mobility
arXiv:2604.05529v1 Announce Type: new Abstract: Human mobility modeling is indispensable for diverse urban applications. However, existing data-driven methods often suffer from data scarcity, limiting their applicability in regions where historical trajectories are unavailable or restricted. To bridge this gap, we...
Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents
arXiv:2604.05549v1 Announce Type: new Abstract: With the widespread application of LLM-based agents across various domains, their complexity has introduced new security threats. Existing red-team methods mostly rely on modifying user prompts, which lack adaptability to new data and may impact...
Adaptive Serverless Resource Management via Slot-Survival Prediction and Event-Driven Lifecycle Control
arXiv:2604.05465v1 Announce Type: new Abstract: Serverless computing eliminates infrastructure management overhead but introduces significant challenges regarding cold start latency and resource utilization. Traditional static resource allocation often leads to inefficiencies under variable workloads, resulting in performance degradation or excessive costs....
Bypassing the CSI Bottleneck: MARL-Driven Spatial Control for Reflector Arrays
arXiv:2604.05162v1 Announce Type: new Abstract: Reconfigurable Intelligent Surfaces (RIS) are pivotal for next-generation smart radio environments, yet their practical deployment is severely bottlenecked by the intractable computational overhead of Channel State Information (CSI) estimation. To bypass this fundamental physical-layer barrier,...
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
arXiv:2604.05550v1 Announce Type: new Abstract: Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating a growing need for systems that can accelerate the full pipeline of empirical model optimization....
Learning What Matters: Dynamic Dimension Selection and Aggregation for Interpretable Vision-Language Reward Modeling
arXiv:2604.05445v1 Announce Type: new Abstract: Vision-language reward modeling faces a dilemma: generative approaches are interpretable but slow, while discriminative ones are efficient but act as opaque "black boxes." To bridge this gap, we propose VL-MDR (Vision-Language Multi-Dimensional Reward), a framework...
Territory Paint Wars: Diagnosing and Mitigating Failure Modes in Competitive Multi-Agent PPO
arXiv:2604.04983v1 Announce Type: new Abstract: We present Territory Paint Wars, a minimal competitive multi-agent reinforcement learning environment implemented in Unity, and use it to systematically investigate failure modes of Proximal Policy Optimisation (PPO) under self-play. A first agent trained for...
PRIME: Prototype-Driven Multimodal Pretraining for Cancer Prognosis with Missing Modalities
arXiv:2604.04999v1 Announce Type: new Abstract: Multimodal self-supervised pretraining offers a promising route to cancer prognosis by integrating histopathology whole-slide images, gene expression, and pathology reports, yet most existing approaches require fully paired and complete inputs. In practice, clinical cohorts are...
Dialogue Act Patterns in GenAI-Mediated L2 Oral Practice: A Sequential Analysis of Learner-Chatbot Interactions
arXiv:2604.05702v1 Announce Type: new Abstract: While generative AI (GenAI) voice chatbots offer scalable opportunities for second language (L2) oral practice, the interactional processes related to learners' gains remain underexplored. This study investigates dialogue act (DA) patterns in interactions between Grade...
Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space
arXiv:2604.05030v1 Announce Type: new Abstract: We present Phase-Associative Memory (PAM), a recurrent sequence model in which all representations are complex-valued, associations accumulate in a matrix state $S_{t}$ $\in$ $\mathbb{C}^{d \times d}$ via outer products, and retrieval operates through the conjugate...
Reproducing AlphaZero on Tablut: Self-Play RL for an Asymmetric Board Game
arXiv:2604.05476v1 Announce Type: new Abstract: This work investigates the adaptation of the AlphaZero reinforcement learning algorithm to Tablut, an asymmetric historical board game featuring unequal piece counts and distinct player objectives (king capture versus king escape). While the original AlphaZero...
Improving Sparse Memory Finetuning
arXiv:2604.05248v1 Announce Type: new Abstract: Large Language Models (LLMs) are typically static after training, yet real-world applications require continual adaptation to new knowledge without degrading existing capabilities. Standard approaches to updating models, like full finetuning or parameter-efficient methods (e.g., LoRA),...
DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects
arXiv:2604.05318v1 Announce Type: new Abstract: Harmful content detectors-particularly disinformation classifiers-are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation unexplored. We present DIA-HARM, the first benchmark for evaluating disinformation detection robustness across 50 English...
Graph Topology Information Enhanced Heterogeneous Graph Representation Learning
arXiv:2604.05732v1 Announce Type: new Abstract: Real-world heterogeneous graphs are inherently noisy and usually not in the optimal graph structures for downstream tasks, which often adversely affects the performance of GRL models in downstream tasks. Although Graph Structure Learning (GSL) methods...
Shadow Derivatives: The Quiet Propertization of AI Learning
Introduction Artificial intelligence (AI) systems learn. In today’s AI markets, durable advantage comes less from any single output than from the learning that accumulates through training, fine-tuning, and downstream feedback loops.[1] Each interaction, correction, and deployment contributes incrementally to improved...
ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads
arXiv:2604.05426v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) is now the dominant method for parameter-efficient fine-tuning of large language models, but achieving a high-quality adapter often requires systematic hyperparameter tuning because LoRA performance is highly sensitive to configuration choices. In...
Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation
arXiv:2604.05257v1 Announce Type: new Abstract: Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from heterogeneous tabular datasets but assume independence between...
PCA-Driven Adaptive Sensor Triage for Edge AI Inference
arXiv:2604.05045v1 Announce Type: new Abstract: Multi-channel sensor networks in industrial IoT often exceed available bandwidth. We propose PCA-Triage, a streaming algorithm that converts incremental PCA loadings into proportional per-channel sampling rates under a bandwidth budget. PCA-Triage runs in O(wdk) time...
Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model
arXiv:2604.04986v1 Announce Type: new Abstract: Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor--critic architectures,...
LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals
arXiv:2604.05655v1 Announce Type: new Abstract: This work characterizes large language models' chain-of-thought generation as a structured trajectory through representation space. We show that mathematical reasoning traverses functionally ordered, step-specific subspaces that become increasingly separable with layer depth. This structure already...