Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models
arXiv:2604.00375v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) theoretically permit token decoding in arbitrary order, a flexibility that could enable richer exploration of reasoning paths than autoregressive (AR) LLMs. In practice, however, random-order decoding often hurts generation quality....
Improving Latent Generalization Using Test-time Compute
arXiv:2604.01430v1 Announce Type: new Abstract: Language Models (LMs) exhibit two distinct mechanisms for knowledge acquisition: in-weights learning (i.e., encoding information within the model weights) and in-context learning (ICL). Although these two modes offer complementary strengths, in-weights learning frequently struggles to...
Amazon is trying to buy Globalstar to compete with SpaceX's Starlink
Amazon wants in on the low-Earth orbit Internet action.
Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning
arXiv:2604.01345v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses adaptive IRL aims to reconstruct the loss function of a forward learner by passively observing its gradients as it...
Asymmetric Actor-Critic for Multi-turn LLM Agents
arXiv:2604.00304v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong reasoning and conversational abilities, but ensuring reliable behavior in multi-turn interactions remains challenging. In many real-world applications, agents must succeed in one-shot settings where retries are impossible. Existing approaches...
The Compression Paradox in LLM Inference: Provider-Dependent Energy Effects of Prompt Compression
arXiv:2603.23528v1 Announce Type: new Abstract: The rapid proliferation of Large Language Models has created an environmental paradox: the very technology that could help solve climate challenges is itself becoming a significant contributor to global carbon emissions. We test whether prompt...
DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models
arXiv:2603.23514v1 Announce Type: new Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain-specific details. No existing methodology provides an out-of-the-box solution for measuring how deeply LLMs can sustain accurate responses under adaptive...
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens
arXiv:2603.23516v1 Announce Type: new Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the field. Due to the constraints of full-attention architectures, the effective context length of large language...
Visuospatial Perspective Taking in Multimodal Language Models
arXiv:2603.23510v1 Announce Type: new Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial to evaluate their perspective-taking abilities. Existing benchmarks largely rely on text-based vignettes or static scene understanding, leaving visuospatial perspective-taking...
The Diminishing Returns of Early-Exit Decoding in Modern LLMs
arXiv:2603.23701v1 Announce Type: new Abstract: In Large Language Model (LLM) inference, early-exit refers to stopping computation at an intermediate layer once the prediction is sufficiently confident, thereby reducing latency and cost. However, recent LLMs adopt improved pretraining recipes and architectures...
PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay
arXiv:2603.23841v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly used as primary sources of information, their potential for political bias may impact their objectivity. Existing benchmarks of LLM social bias primarily evaluate gender and racial stereotypes. When...
Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction
arXiv:2603.23550v1 Announce Type: new Abstract: Multi-turn human-AI collaboration is fundamental to deploying interactive services such as adaptive tutoring, conversational recommendation, and professional consultation. However, optimizing these interactions via reinforcement learning is hindered by the sparsity of verifiable intermediate rewards and...
PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning
arXiv:2603.23574v1 Announce Type: new Abstract: Federated Learning (FL), as a popular distributed learning paradigm, has shown outstanding performance in improving computational efficiency and protecting data privacy, and is widely applied in industrial image classification. However, due to its distributed nature,...
BXRL: Behavior-Explainable Reinforcement Learning
arXiv:2603.23738v1 Announce Type: new Abstract: A major challenge of Reinforcement Learning is that agents often learn undesired behaviors that seem to defy the reward structure they were given. Explainable Reinforcement Learning (XRL) methods can answer queries such as "explain this...
Can we generate portable representations for clinical time series data using LLMs?
arXiv:2603.23987v1 Announce Type: new Abstract: Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question -- can large language models (LLMs)...
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
arXiv:2603.22629v1 Announce Type: new Abstract: Adapting pretrained language models to low-resource, morphologically rich languages remains a significant challenge. Existing vocabulary expansion methods typically rely on arbitrarily segmented subword units, resulting in fragmented lexical representations and loss of critical morphological information....
Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs
arXiv:2603.22446v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has significantly improved reasoning in large language models (LLMs), yet the token-level mechanisms underlying these improvements remain unclear. We present a systematic empirical study of RLVR's distributional effects organized...
Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?
arXiv:2603.22582v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning has been proposed as a transparency mechanism for large language models in safety-critical deployments, yet its effectiveness depends on faithfulness (whether models accurately verbalize the factors that actually influence their outputs), a...
Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures
arXiv:2603.22473v1 Announce Type: new Abstract: Hybrid language models combining attention with state space models (SSMs) or linear attention offer improved efficiency, but whether both components are genuinely utilized remains unclear. We present a functional component ablation framework applied to two...
Analysing LLM Persona Generation and Fairness Interpretation in Polarised Geopolitical Contexts
arXiv:2603.22837v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly utilised for social simulation and persona generation, necessitating an understanding of how they represent geopolitical identities. In this paper, we analyse personas generated for Palestinian and Israeli identities by...
I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes
arXiv:2603.23229v1 Announce Type: new Abstract: Internet memes represent a popular form of multimodal online communication and often use figurative elements to convey layered meaning through the combination of text and images. However, it remains largely unclear how multimodal large language...
Research on Individual Trait Clustering and Development Pathway Adaptation Based on the K-means Algorithm
arXiv:2603.22302v1 Announce Type: new Abstract: With the development of information technology, the application of artificial intelligence and machine learning in the field of education shows great potential. This study aims to explore how to utilize K-means clustering algorithm to provide...
Full waveform inversion method based on diffusion model
arXiv:2603.22307v1 Announce Type: new Abstract: Seismic full-waveform inversion is a core technology for obtaining high-resolution subsurface model parameters. However, its highly nonlinear characteristics and strong dependence on the initial model often lead to the inversion process getting trapped in local...
ST-GDance++: A Scalable Spatial-Temporal Diffusion for Long-Duration Group Choreography
arXiv:2603.22316v1 Announce Type: new Abstract: Group dance generation from music requires synchronizing multiple dancers while maintaining spatial coordination, making it highly relevant to applications such as film production, gaming, and animation. Recent group dance generation models have achieved promising generation...
Geometric Mixture-of-Experts with Curvature-Guided Adaptive Routing for Graph Representation Learning
arXiv:2603.22317v1 Announce Type: new Abstract: Graph-structured data typically exhibits complex topological heterogeneity, making it difficult to model accurately within a single Riemannian manifold. While emerging mixed-curvature methods attempt to capture such diversity, they often rely on implicit, task-driven routing that...
FAAR: Format-Aware Adaptive Rounding for NVFP4
arXiv:2603.22370v1 Announce Type: new Abstract: Deploying large language models (LLMs) on edge devices requires extremely low-bit quantization. Ultra-low precision formats such as NVFP4 offer a promising solution for reducing memory footprint and accelerating computation. However, existing quantization methods typically rely...
Doss raises $55M for AI inventory management that plugs into ERP
Doss's AI-powered inventory management system integrates with existing ERP systems. The Series B round was co-led by Madrona and Premji Invest.
The AI Scientific Community: Agentic Virtual Lab Swarms
arXiv:2603.21344v1 Announce Type: new Abstract: In this short note we propose using agentic swarms of virtual labs as a model of an AI Science Community. In this paradigm, each particle in the swarm represents a complete virtual laboratory instance, enabling...
Do LLM-Driven Agents Exhibit Engagement Mechanisms? Controlled Tests of Information Load, Descriptive Norms, and Popularity Cues
arXiv:2603.20911v1 Announce Type: new Abstract: Large language models make agent-based simulation more behaviorally expressive, but they also sharpen a basic methodological tension: fluent, human-like output is not, by itself, evidence for theory. We evaluate what an LLM-driven simulation can credibly...