Discrete Flow Matching Policy Optimization
arXiv:2604.06491v1 Announce Type: new Abstract: We introduce Discrete flow Matching policy Optimization (DoMinO), a unified framework for Reinforcement Learning (RL) fine-tuning of Discrete Flow Matching (DFM) models under a broad class of policy gradient methods. Our key idea is to view...
The Illusion of Stochasticity in LLMs
arXiv:2604.06543v1 Announce Type: new Abstract: In this work, we demonstrate that reliable stochastic sampling is a fundamental yet unfulfilled requirement for Large Language Models (LLMs) operating as agents. Agentic systems are frequently required to sample from distributions, often inferred from...
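The abstract above is truncated, but the requirement it names, sampling faithfully from a specified distribution, can be probed in isolation. A minimal sketch, where `sample_choice` is a hypothetical stand-in for an agent's sampler (not anything from the paper): draw repeatedly and compare empirical frequencies against the target with a chi-square statistic.

```python
import random
from collections import Counter

def sample_choice(options, probs):
    """Hypothetical stand-in for an LLM agent asked to sample from `probs`.
    Here it samples correctly; a real LLM's choices may be far more skewed."""
    return random.choices(options, weights=probs, k=1)[0]

def chi_square_stat(options, probs, n=10_000):
    """Compare empirical sample frequencies against the requested distribution."""
    counts = Counter(sample_choice(options, probs) for _ in range(n))
    return sum((counts[o] - n * p) ** 2 / (n * p) for o, p in zip(options, probs))

# With 2 degrees of freedom, values far above ~6 suggest unfaithful sampling.
print(chi_square_stat(["a", "b", "c"], [0.5, 0.3, 0.2]))
```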
How our digital devices are putting our right to privacy at risk
Law professor Andrew Guthrie Ferguson chats with Ars about his new book, Your Data Will Be Used Against You.
Optimal Rates for Pure ε-Differentially Private Stochastic Convex Optimization with Heavy Tails
arXiv:2604.06492v1 Announce Type: new Abstract: We study stochastic convex optimization (SCO) with heavy-tailed gradients under pure ε-differential privacy (DP). Instead of assuming a bound on the worst-case Lipschitz parameter of the loss, we assume only a bounded k-th moment. This...
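For readers unfamiliar with the pure-DP setting: the standard mechanism pairs gradient clipping (to bound sensitivity) with Laplace rather than Gaussian noise. A minimal sketch of one noisy gradient release, with an illustrative clipping threshold and noise scale; it ignores composition across steps and is not the algorithm or the rates from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_gradient(per_sample_grads, clip, eps):
    """One pure eps-DP mean-gradient release: clip each per-sample gradient to
    L1 norm `clip` (bounding the mean's L1 sensitivity to 2*clip/n), then add
    per-coordinate Laplace noise. Per-release guarantee only; composition over
    the training run is not accounted for here."""
    n, d = per_sample_grads.shape
    norms = np.abs(per_sample_grads).sum(axis=1, keepdims=True)
    clipped = per_sample_grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    noise = rng.laplace(scale=2 * clip / (n * eps), size=d)
    return clipped.mean(axis=0) + noise

# Toy linear regression for illustration.
X, y, w = rng.normal(size=(100, 5)), rng.normal(size=100), np.zeros(5)
for _ in range(50):
    g = (X @ w - y)[:, None] * X          # per-sample gradients
    w -= 0.1 * dp_gradient(g, clip=1.0, eps=1.0)
print(w)
```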
Limits of Difficulty Scaling: Hard Samples Yield Diminishing Returns in GRPO-Tuned SLMs
arXiv:2604.06298v1 Announce Type: new Abstract: Recent alignment work on Large Language Models (LLMs) suggests preference optimization can improve reasoning by shifting probability mass toward better solutions. We test this claim in a resource-constrained setting by applying GRPO with LoRA to...
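For context, GRPO's defining ingredient is a critic-free, group-relative advantage: each sampled completion's reward is normalized by the mean and standard deviation of its group. A minimal sketch:

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages as in GRPO: normalize each completion's reward
    by the mean and std of its sampled group, so no learned critic is needed."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# One prompt, four sampled completions scored 0/1 for correctness.
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # correct samples get positive advantage
```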
LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources
arXiv:2604.06571v1 Announce Type: new Abstract: Missing-person and child-safety investigations rely on heterogeneous case documents, including structured forms, bulletin-style posters, and narrative web profiles. Variations in layout, terminology, and data quality impede rapid triage, large-scale analysis, and search-planning workflows. This paper...
Consistency-Guided Decoding with Proof-Driven Disambiguation for Three-Way Logical Question Answering
arXiv:2604.06196v1 Announce Type: new Abstract: Three-way logical question answering (QA) assigns True/False/Unknown to a hypothesis H given a premise set S. While modern large language models (LLMs) can be accurate on isolated examples, we identify two recurring failure modes in...
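One natural consistency probe in this three-way setting checks that a model's label for H and for its negation are complementary. A small sketch with a hypothetical `classify` interface; the paper's proof-driven disambiguation is not reproduced here.

```python
# Hypothetical `classify(premises, hypothesis)` returns "True"/"False"/"Unknown".
NEGATE = {"True": "False", "False": "True", "Unknown": "Unknown"}

def consistent(classify, premises, hypothesis, negated_hypothesis):
    """Logical-consistency probe: labels for H and not-H must be complementary
    (True<->False) or both Unknown; anything else is a failure mode."""
    a = classify(premises, hypothesis)
    b = classify(premises, negated_hypothesis)
    return NEGATE[a] == b

def toy_classify(premises, h):
    """Toy classifier over string 'facts', for illustration only."""
    neg = h.startswith("not ")
    core = h[4:] if neg else h
    if core in premises:
        return "False" if neg else "True"
    return "Unknown"

print(consistent(toy_classify, {"it rains"}, "it rains", "not it rains"))  # True
```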
Stochastic Gradient Descent in the Saddle-to-Saddle Regime of Deep Linear Networks
arXiv:2604.06366v1 Announce Type: new Abstract: Deep linear networks (DLNs) are used as an analytically tractable model of the training dynamics of deep neural networks. While gradient descent in DLNs is known to exhibit saddle-to-saddle dynamics, the impact of stochastic gradient...
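The saddle-to-saddle picture is easy to reproduce numerically: with small initialization, a deep linear network's loss plateaus and then drops as each singular value of the target map is learned. A minimal NumPy sketch with illustrative hyperparameters, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Rank-2 target map: singular values 3 and 1, the rest zero.
U = np.linalg.qr(rng.normal(size=(10, 10)))[0]
V = np.linalg.qr(rng.normal(size=(10, 10)))[0]
target = U @ np.diag([3.0, 1.0] + [0.0] * 8) @ V.T

# 3-layer deep linear net with small init: SGD loss tends to plateau near
# saddles and drop as each singular value is learned (saddle-to-saddle).
Ws = [0.1 * rng.normal(size=(10, 10)) for _ in range(3)]
for step in range(20_001):
    x = rng.normal(size=(10, 32))                     # minibatch -> stochastic gradient
    err = Ws[2] @ Ws[1] @ Ws[0] @ x - target @ x
    g3 = err @ (Ws[1] @ Ws[0] @ x).T / 32             # backprop through the product
    g2 = Ws[2].T @ err @ (Ws[0] @ x).T / 32
    g1 = (Ws[2] @ Ws[1]).T @ err @ x.T / 32
    for W, g in zip(Ws, (g1, g2, g3)):
        W -= 0.05 * g
    if step % 2000 == 0:
        print(step, float(np.mean(err ** 2)))         # plateaus between sharp drops
```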
Final 3 days to save up to $500 on your TechCrunch Disrupt 2026 pass
Save up to $500 on your TechCrunch Disrupt 2026 pass until April 10, 11:59 p.m. PT. Secure your spot at the center of the tech ecosystem. Register today.
Improving Robustness In Sparse Autoencoders via Masked Regularization
arXiv:2604.06495v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) are widely used in mechanistic interpretability to project LLM activations onto sparse latent spaces. However, sparsity alone is an imperfect proxy for interpretability, and current training objectives often result in brittle latent...
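For orientation, a vanilla SAE objective is reconstruction error plus an L1 penalty on the latents. The sketch below adds a mask that restricts the penalty to a subset of latents, as an illustrative guess at the general shape of masked regularization; it is not the paper's objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def sae_loss(x, W_enc, b_enc, W_dec, mask, l1=1e-3):
    """Sparse autoencoder loss: reconstruction + L1 sparsity. The `mask` below
    penalizes only a masked subset of latents -- an illustrative guess at
    masked regularization, not the paper's actual training objective."""
    z = np.maximum(0.0, x @ W_enc + b_enc)      # ReLU latents
    x_hat = z @ W_dec
    recon = np.mean((x - x_hat) ** 2)
    sparsity = l1 * np.mean(np.abs(z * mask))   # regularize only masked latents
    return recon + sparsity

d, m = 32, 128                                   # activation dim, latent dim
x = rng.normal(size=(64, d))
W_enc, b_enc = rng.normal(size=(d, m)) * 0.1, np.zeros(m)
W_dec = rng.normal(size=(m, d)) * 0.1
mask = rng.random(m) < 0.5                       # random latent mask per step
print(sae_loss(x, W_enc, b_enc, W_dec, mask))
```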
AE-ViT: Stable Long-Horizon Parametric Partial Differential Equations Modeling
arXiv:2604.06475v1 Announce Type: new Abstract: Deep Learning Reduced Order Models (ROMs) are becoming increasingly popular as surrogate models for parametric partial differential equations (PDEs) due to their ability to handle high-dimensional data, approximate highly nonlinear mappings, and utilize GPUs. Existing...
Bi-Level Optimization for Single Domain Generalization
arXiv:2604.06349v1 Announce Type: new Abstract: Generalizing from a single labeled source domain to unseen target domains, without access to any target data during training, remains a fundamental challenge in robust machine learning. We address this underexplored setting, known as Single...
FLeX: Fourier-based Low-rank EXpansion for multilingual transfer
arXiv:2604.06253v1 Announce Type: new Abstract: Cross-lingual code generation is critical in enterprise environments where multiple programming languages coexist. However, fine-tuning large language models (LLMs) individually for each language is computationally prohibitive. This paper investigates whether parameter-efficient fine-tuning methods and optimizer...
Scientific Knowledge-driven Decoding Constraints Improving the Reliability of LLMs
arXiv:2604.06603v1 Announce Type: new Abstract: Large language models (LLMs) have shown extensive knowledge and strong task-solving capabilities, but still face the challenge of severe hallucination, which hinders their practical application. Though scientific theories and rules can efficiently direct the behaviors of...
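The generic machinery for knowledge-driven constraints at decode time is logit masking: tokens that would violate a rule get probability zero. A minimal sketch with a hand-specified allowed set standing in for constraints derived from scientific rules:

```python
import numpy as np

def constrained_sample(logits, allowed, rng):
    """Generic constrained decoding step: set logits of tokens outside the
    constraint set to -inf before softmax, so only admissible tokens can be
    sampled. `allowed` would come from domain rules; here it is hand-specified."""
    masked = np.where(np.isin(np.arange(len(logits)), list(allowed)), logits, -np.inf)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
print(constrained_sample(logits, allowed={0, 2}, rng=rng))  # tokens 1, 3 excluded
```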
To beat Altman in court, Musk offers to give all damages to OpenAI nonprofit
Musk won’t seek a “single dollar” in OpenAI suit after asking to pocket up to $134 billion.
Time-Series Classification with Multivariate Statistical Dependence Features
arXiv:2604.06537v1 Announce Type: new Abstract: In this paper, we propose a novel framework for non-stationary time-series analysis that replaces conventional correlation-based statistics with direct estimation of statistical dependence in the normalized joint density of input and target signals, the cross...
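A simple plug-in version of such a dependence statistic can be computed from the normalized joint histogram of the two signals, here as mutual information; this is a stand-in for the paper's cross-density feature, not its estimator.

```python
import numpy as np

def dependence_feature(x, y, bins=16):
    """Estimate statistical dependence between input and target signals from
    their normalized joint histogram (mutual information in nats); a plug-in
    stand-in for the paper's cross-density statistic."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
t = rng.normal(size=5000)
print(dependence_feature(t, t + 0.1 * rng.normal(size=5000)))  # strong dependence
print(dependence_feature(t, rng.normal(size=5000)))            # near zero
```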
Learning to Interrupt in Language-based Multi-agent Communication
arXiv:2604.06452v1 Announce Type: new Abstract: Multi-agent systems using large language models (LLMs) have demonstrated impressive capabilities across various domains. However, current agent communication suffers from verbose outputs that overload context and increase computational costs. Although existing approaches focus on compressing...
TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models
arXiv:2604.06291v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of Large Language Models (LLMs), and recent Mixture-of-Experts (MoE) extensions further enhance flexibility by dynamically combining multiple LoRA experts. However, existing MoE-augmented LoRA methods assume that experts operate independently,...
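The common MoE-LoRA forward pass, which this line of work builds on, adds a router-weighted sum of low-rank updates to the frozen base layer. A minimal sketch; how TalkLoRA makes the experts communication-aware is beyond it.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_lora_forward(x, W, experts, router_W):
    """Mixture of LoRA experts: base output W @ x plus a softmax-router-weighted
    sum of low-rank updates B_i @ A_i @ x. Standard MoE-LoRA shape only; the
    independence assumption TalkLoRA targets is not modeled here."""
    gates = np.exp(router_W @ x)
    gates /= gates.sum()                               # softmax routing weights
    delta = sum(g * (B @ (A @ x)) for g, (A, B) in zip(gates, experts))
    return W @ x + delta

d, r, n_exp = 16, 4, 3                                 # hidden dim, rank, experts
W = rng.normal(size=(d, d)) * 0.1
experts = [(rng.normal(size=(r, d)) * 0.1, rng.normal(size=(d, r)) * 0.1)
           for _ in range(n_exp)]
router_W = rng.normal(size=(n_exp, d)) * 0.1
print(moe_lora_forward(rng.normal(size=d), W, experts, router_W))
```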
Tubi is the first streamer to launch a native app within ChatGPT
Tubi becomes the first streaming service to offer an app integration within ChatGPT, the AI chatbot that millions of users turn to for answers.
OpenAI releases a new safety blueprint to address the rise in child sexual exploitation
OpenAI's new Child Safety Blueprint aims to tackle the alarming rise in child sexual exploitation linked to advancements in AI.
Towards Accurate and Calibrated Classification: Regularizing Cross-Entropy From A Generative Perspective
arXiv:2604.06689v1 Announce Type: new Abstract: Reliable classification requires not only high predictive accuracy but also well-calibrated confidence estimates. Yet, modern deep neural networks (DNNs) are often overconfident, primarily due to overfitting on the negative log-likelihood (NLL). While focal loss variants...
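The standard way to quantify the overconfidence this abstract mentions is Expected Calibration Error: bin predictions by confidence and compare each bin's mean confidence to its accuracy. A minimal sketch:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error: bin predictions by confidence and average the
    gap between mean confidence and empirical accuracy, weighted by bin size."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# An overconfident model: 90% average confidence but only 60% accuracy.
print(expected_calibration_error([0.9, 0.95, 0.85, 0.92, 0.88],
                                 [1, 1, 0, 1, 0]))
```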
SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
arXiv:2604.06636v1 Announce Type: new Abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capabilities and unresolved token inefficiency. To address this,...
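Potential-based shaping is the classical tool for crediting progress rather than verbosity: step t earns r_t + γΦ(s_{t+1}) − Φ(s_t). A minimal sketch with hand-set potentials; how SHAPE estimates its stage-aware potentials is not shown.

```python
def shaped_advantages(rewards, potentials, gamma=1.0):
    """Generic potential-based shaping: credit step t with
    r_t + gamma * Phi(s_{t+1}) - Phi(s_t), so credit reflects progress toward
    the answer rather than verbosity. Potentials are hand-set for illustration;
    SHAPE's stage-aware potential estimation is not reproduced here."""
    return [r + gamma * potentials[t + 1] - potentials[t]
            for t, r in enumerate(rewards)]

# 3 reasoning steps: only the last gets task reward, but potentials rise with
# genuine progress, so intermediate steps still receive credit.
print(shaped_advantages(rewards=[0.0, 0.0, 1.0],
                        potentials=[0.0, 0.4, 0.7, 1.0]))  # [0.4, 0.3, 1.3]
```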
Inference-Time Code Selection via Symbolic Equivalence Partitioning
arXiv:2604.06485v1 Announce Type: new Abstract: "Best-of-N" selection is a popular inference-time scaling method for code generation using Large Language Models (LLMs). However, to reliably identify correct solutions, existing methods often depend on expensive or stochastic external verifiers. In this paper,...
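A verifier-free flavor of this idea partitions the N candidates into equivalence classes by their behavior on probe inputs and returns a member of the largest class. The sketch below uses concrete execution rather than symbolic reasoning, so it illustrates only the partitioning step:

```python
from collections import defaultdict

def select_by_equivalence(candidates, test_inputs):
    """Verifier-free best-of-N: group candidate programs whose observable
    behavior agrees on every probe input, then return one member of the
    largest equivalence class (majority behavior)."""
    classes = defaultdict(list)
    for fn in candidates:
        signature = []
        for x in test_inputs:
            try:
                signature.append(repr(fn(x)))
            except Exception as e:                 # crashes also define behavior
                signature.append(type(e).__name__)
        classes[tuple(signature)].append(fn)
    return max(classes.values(), key=len)[0]

# Three sampled "solutions" to abs(x); two agree and outvote the buggy one.
cands = [lambda x: x if x >= 0 else -x, lambda x: max(x, -x), lambda x: x]
best = select_by_equivalence(cands, [-2, 0, 3])
print(best(-2))  # 2
```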
Distributed Interpretability and Control for Large Language Models
arXiv:2604.06483v1 Announce Type: new Abstract: The most capable large language models are usually those that require multiple GPU cards to host. It is necessary to understand and steer these models, but current technologies do not support the interpretability and...
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
arXiv:2604.06427v1 Announce Type: new Abstract: The viability of chain-of-thought (CoT) monitoring hinges on models being unable to reason effectively in their latent representations. Yet little is known about the limits of such latent reasoning in LLMs. We test these limits...
Toward a universal foundation model for graph-structured data
arXiv:2604.06391v1 Announce Type: new Abstract: Graphs are a central representation in biomedical research, capturing molecular interaction networks, gene regulatory circuits, cell-cell communication maps, and knowledge graphs. Despite their importance, there is currently no broadly reusable foundation model available for...
Asymptotic-Preserving Neural Networks for Viscoelastic Parameter Identification in Multiscale Blood Flow Modeling
arXiv:2604.06287v1 Announce Type: new Abstract: Mathematical models and numerical simulations offer a non-invasive way to explore cardiovascular phenomena, providing access to quantities that cannot be measured directly. In this study, we start with a one-dimensional multiscale blood flow model that...
RAGEN-2: Reasoning Collapse in Agentic RL
arXiv:2604.06268v1 Announce Type: new Abstract: RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot...
DiffuMask: Diffusion Language Model for Token-level Prompt Pruning
arXiv:2604.06627v1 Announce Type: new Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs), but these gains typically come at the cost of longer, more expensive prompts that may contain redundant information. Prompt compression based on pruning offers a...
Does a Global Perspective Help Prune Sparse MoEs Elegantly?
arXiv:2604.06542v1 Announce Type: new Abstract: Empirical scaling laws for language models have encouraged the development of ever-larger LLMs, despite their growing computational and memory costs. Sparse Mixture-of-Experts (MoEs) offer a promising alternative by activating only a subset of experts per...
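A "global perspective" on expert pruning can be as simple as scoring every expert by its total routing mass over a calibration set, rather than ranking within each layer. A minimal illustration with one layer and synthetic router probabilities; it is not the paper's criterion.

```python
import numpy as np

def prune_experts_globally(routing_probs, keep):
    """Global pruning view: score each expert by its total routing mass across
    a calibration set, then keep the top `keep` experts. A minimal illustration
    of a global (rather than per-layer) pruning criterion."""
    scores = routing_probs.sum(axis=0)              # aggregate over tokens
    return np.argsort(scores)[::-1][:keep]

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 8))                 # 1000 tokens, 8 experts
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)           # per-token routing softmax
print(prune_experts_globally(probs, keep=4))        # indices of retained experts
```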