To beat Altman in court, Musk offers to give all damages to OpenAI nonprofit
Musk won’t seek a “single dollar” in OpenAI suit after asking to pocket up to $134 billion.
Time-Series Classification with Multivariate Statistical Dependence Features
arXiv:2604.06537v1 Announce Type: new Abstract: In this paper, we propose a novel framework for non-stationary time-series analysis that replaces conventional correlation-based statistics with direct estimation of statistical dependence in the normalized joint density of input and target signals, the cross...
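As a rough illustration of measuring dependence directly from a normalized joint density rather than via correlation, the sketch below estimates mutual information from a 2-D histogram of paired, discretized samples. The function name and binning are illustrative assumptions, not the paper's cross-statistic:

```python
import math
from collections import Counter

def mutual_information(x_bins, y_bins):
    """Estimate mutual information (in nats) from paired discretized samples.

    The joint histogram is normalized into an empirical joint density, and
    dependence is read off that density directly instead of from correlation.
    """
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))   # empirical joint counts
    px = Counter(x_bins)                   # marginal counts for x
    py = Counter(y_bins)                   # marginal counts for y
    mi = 0.0
    for (xv, yv), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (px * py)
        mi += p_xy * math.log(c * n / (px[xv] * py[yv]))
    return mi
```

Unlike Pearson correlation, this statistic is zero only when the binned variables are (empirically) independent, and it captures nonlinear dependence.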
Learning to Interrupt in Language-based Multi-agent Communication
arXiv:2604.06452v1 Announce Type: new Abstract: Multi-agent systems using large language models (LLMs) have demonstrated impressive capabilities across various domains. However, current agent communication suffers from verbose outputs that overload context windows and increase computational costs. Although existing approaches focus on compressing...
TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models
arXiv:2604.06291v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of Large Language Models (LLMs), and recent Mixture-of-Experts (MoE) extensions further enhance flexibility by dynamically combining multiple LoRA experts. However, existing MoE-augmented LoRA methods assume that experts operate independently,...
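For context on what a LoRA "expert" is, here is a minimal sketch of the standard LoRA forward pass, y = Wx + (α/r)·B(Ax), in pure Python; the helper names and toy dimensions are illustrative, not TalkLoRA's mixture mechanism:

```python
def matvec(M, v):
    """Plain matrix-vector product for nested-list matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=16.0):
    """Standard LoRA update: y = W x + (alpha / r) * B (A x).

    A is r x d (down-projection), B is d_out x r (up-projection);
    only A and B are trained, while the base weight W stays frozen.
    """
    r = len(A)                       # low-rank dimension
    base = matvec(W, x)              # frozen base model output
    delta = matvec(B, matvec(A, x))  # low-rank correction
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]
```

MoE-style extensions such as the one in the abstract combine several (A, B) pairs per layer, weighting their deltas with a learned router.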
Tubi is the first streamer to launch a native app within ChatGPT
Tubi becomes the first streaming service to offer an app integration within ChatGPT, the AI chatbot that millions of users turn to for answers.
OpenAI releases a new safety blueprint to address the rise in child sexual exploitation
OpenAI's new Child Safety Blueprint aims to tackle the alarming rise in child sexual exploitation linked to advancements in AI.
Towards Accurate and Calibrated Classification: Regularizing Cross-Entropy From A Generative Perspective
arXiv:2604.06689v1 Announce Type: new Abstract: Accurate classification requires not only high predictive accuracy but also well-calibrated confidence estimates. Yet, modern deep neural networks (DNNs) are often overconfident, primarily due to overfitting on the negative log-likelihood (NLL). While focal loss variants...
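For reference, the focal-loss family mentioned above modulates the NLL by (1 − p)^γ, down-weighting confident (easy) examples; γ = 0 recovers plain cross-entropy. A minimal sketch, with illustrative naming:

```python
import math

def focal_loss(p_true, gamma=2.0):
    """Focal loss given the predicted probability of the true class.

    The (1 - p)^gamma factor shrinks the loss on confident, easy
    examples; gamma = 0 reduces to the negative log-likelihood.
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)
```

This damping of the overfitting-prone NLL on easy examples is one reason focal-loss variants tend to produce better-calibrated confidence estimates.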
SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
arXiv:2604.06636v1 Announce Type: new Abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capabilities and unresolved token inefficiency. To address this,...
Inference-Time Code Selection via Symbolic Equivalence Partitioning
arXiv:2604.06485v1 Announce Type: new Abstract: "Best-of-N" selection is a popular inference-time scaling method for code generation using Large Language Models (LLMs). However, to reliably identify correct solutions, existing methods often depend on expensive or stochastic external verifiers. In this paper,...
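The general idea of equivalence-partitioned Best-of-N can be sketched concretely: group the N candidate programs by their behavior on shared inputs and return a member of the largest class. This execution-based grouping is a stand-in assumption; the paper's partitioning is symbolic, not execution-based:

```python
from collections import defaultdict

def select_best_of_n(candidates, test_inputs):
    """Pick a candidate from the largest behavioral-equivalence class.

    Candidates producing identical outputs on all test_inputs are grouped
    together; agreement among many independent samples serves as a
    verifier-free correctness signal.
    """
    classes = defaultdict(list)
    for fn in candidates:
        signature = tuple(fn(x) for x in test_inputs)  # behavior fingerprint
        classes[signature].append(fn)
    largest = max(classes.values(), key=len)
    return largest[0]
```

The appeal over external verifiers is that no ground-truth tests or reward model are needed, only mutual agreement among samples.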
Distributed Interpretability and Control for Large Language Models
arXiv:2604.06483v1 Announce Type: new Abstract: Large language models that require multiple GPU cards to host are usually the most capable. It is necessary to understand and steer these models, but current technologies do not support the interpretability and...
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
arXiv:2604.06427v1 Announce Type: new Abstract: The viability of chain-of-thought (CoT) monitoring hinges on models being unable to reason effectively in their latent representations. Yet little is known about the limits of such latent reasoning in LLMs. We test these limits...
Toward a universal foundation model for graph-structured data
arXiv:2604.06391v1 Announce Type: new Abstract: Graphs are a central representation in biomedical research, capturing molecular interaction networks, gene regulatory circuits, cell--cell communication maps, and knowledge graphs. Despite their importance, there is currently no broadly reusable foundation model available for...
Asymptotic-Preserving Neural Networks for Viscoelastic Parameter Identification in Multiscale Blood Flow Modeling
arXiv:2604.06287v1 Announce Type: new Abstract: Mathematical models and numerical simulations offer a non-invasive way to explore cardiovascular phenomena, providing access to quantities that cannot be measured directly. In this study, we start with a one-dimensional multiscale blood flow model that...
RAGEN-2: Reasoning Collapse in Agentic RL
arXiv:2604.06268v1 Announce Type: new Abstract: RL training of multi-turn LLM agents is inherently unstable, and reasoning quality directly determines task performance. Entropy is widely used to track reasoning stability. However, entropy only measures diversity within the same input, and cannot...
DiffuMask: Diffusion Language Model for Token-level Prompt Pruning
arXiv:2604.06627v1 Announce Type: new Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs). These typically come at the cost of longer, more expensive prompts that may contain redundant information. Prompt compression based on pruning offers a...
Scientific Knowledge-driven Decoding Constraints Improving the Reliability of LLMs
arXiv:2604.06603v1 Announce Type: new Abstract: Large language models (LLMs) possess broad knowledge and strong task-solving capabilities, but still suffer from severe hallucination, which hinders their practical application. Though scientific theories and rules can efficiently direct the behaviors of...
State-of-the-Art Arabic Language Modeling with Sparse MoE Fine-Tuning and Chain-of-Thought Distillation
arXiv:2604.06421v1 Announce Type: new Abstract: This paper introduces Arabic-DeepSeek-R1, an application-driven open-source Arabic LLM that leverages a sparse MoE backbone to address the digital equity gap for under-represented languages, and establishes a new SOTA across the entire Open Arabic LLM...
Attention Flows: Tracing LLM Conceptual Engagement via Story Summaries
arXiv:2604.06416v1 Announce Type: new Abstract: Although LLM context lengths have grown, there is evidence that their ability to integrate information across long-form texts has not kept pace. We evaluate one such understanding task: generating summaries of novels. When human authors...
ART: Attention Replacement Technique to Improve Factuality in LLMs
arXiv:2604.06393v1 Announce Type: new Abstract: Hallucination in large language models (LLMs) continues to be a significant issue, particularly in tasks like question answering, where models often generate plausible yet incorrect or irrelevant information. Although various methods have been proposed to...
Application-Driven Pedagogical Knowledge Optimization of Open-Source LLMs via Reinforcement Learning and Supervised Fine-Tuning
arXiv:2604.06385v1 Announce Type: new Abstract: We present an innovative multi-stage optimization strategy combining reinforcement learning (RL) and supervised fine-tuning (SFT) to enhance the pedagogical knowledge of large language models (LLMs), as illustrated by EduQwen 32B-RL1, EduQwen 32B-SFT, and an optional...
Blending Human and LLM Expertise to Detect Hallucinations and Omissions in Mental Health Chatbot Responses
arXiv:2604.06216v1 Announce Type: new Abstract: As LLM-powered chatbots are increasingly deployed in mental health services, detecting hallucinations and omissions has become critical for user safety. However, state-of-the-art LLM-as-a-judge methods often fail in high-risk healthcare contexts, where subtle errors can have...
Illocutionary Explanation Planning for Source-Faithful Explanations in Retrieval-Augmented Language Models
arXiv:2604.06211v1 Announce Type: new Abstract: Natural language explanations produced by large language models (LLMs) are often persuasive, but not necessarily scrutable: users cannot easily verify whether the claims in an explanation are supported by evidence. In XAI, this motivates a...
TelcoAgent-Bench: A Multilingual Benchmark for Telecom AI Agents
arXiv:2604.06209v1 Announce Type: new Abstract: The integration of large language model (LLM) agents into telecom networks introduces new challenges related to intent recognition, tool execution, and resolution generation under varying operational constraints. In this paper, we introduce...
Extracting Breast Cancer Phenotypes from Clinical Notes: Comparing LLMs with Classical Ontology Methods
arXiv:2604.06208v1 Announce Type: new Abstract: A significant amount of data held in Oncology Electronic Medical Records (EMRs) is contained in unstructured provider notes -- including, but not limited to, chemotherapy (or cancer treatment) outcomes, various biomarkers, the tumor's location,...
Atlassian launches visual AI tools and third-party agents in Confluence
Confluence users can now create visual assets within the software in addition to new third-party agents working with Lovable, Replit, and Gamma.
Prune-Quantize-Distill: An Ordered Pipeline for Efficient Neural Network Compression
arXiv:2604.04988v1 Announce Type: new Abstract: Modern deployment often requires trading accuracy for efficiency under tight CPU and memory constraints, yet common compression proxies such as parameter count or FLOPs do not reliably predict wall-clock inference time. In particular, unstructured sparsity...
Cross-fitted Proximal Learning for Model-Based Reinforcement Learning
arXiv:2604.05185v1 Announce Type: new Abstract: Model-based reinforcement learning is attractive for sequential decision-making because it explicitly estimates reward and transition models and then supports planning through simulated rollouts. In offline settings with hidden confounding, however, models learned directly from observational...
PCA-Driven Adaptive Sensor Triage for Edge AI Inference
arXiv:2604.05045v1 Announce Type: new Abstract: Multi-channel sensor networks in industrial IoT often exceed available bandwidth. We propose PCA-Triage, a streaming algorithm that converts incremental PCA loadings into proportional per-channel sampling rates under a bandwidth budget. PCA-Triage runs in O(wdk) time...
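The core allocation step described in the abstract (loadings → proportional sampling rates under a budget) can be sketched as below. The loadings are assumed given here, the incremental PCA itself and the function names are illustrative, not the paper's implementation:

```python
def allocate_rates(loadings, budget_hz, min_rate=0.0):
    """Split a total sampling budget across channels in proportion to
    the magnitude of each channel's (e.g. PCA) loading.

    Channels with larger loadings receive more of the budget; min_rate
    sets a floor so no channel is silenced entirely.
    """
    weights = [abs(l) for l in loadings]
    total = sum(weights)
    if total == 0:
        # Degenerate case: no signal anywhere, split the budget evenly.
        return [budget_hz / len(loadings)] * len(loadings)
    spare = budget_hz - min_rate * len(loadings)
    return [min_rate + spare * w / total for w in weights]
```

Recomputing the loadings incrementally as new windows arrive is what makes the scheme adaptive to drifting channel importance.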
Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model
arXiv:2604.04986v1 Announce Type: new Abstract: Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor--critic architectures,...
Expectation Maximization (EM) Converges for General Agnostic Mixtures
arXiv:2604.05842v1 Announce Type: new Abstract: Mixtures of linear regressions are well studied in statistics and machine learning, where data points are generated probabilistically by one of $k$ linear models. Algorithms like Expectation Maximization (EM) may be used to recover the ground...
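For a concrete picture of the EM iteration being analyzed, here is a toy sketch for k = 2 intercept-free scalar regressions with fixed noise scale; the setup (known sigma, equal mixing weights) is a simplifying assumption, not the paper's agnostic setting:

```python
import math

def em_two_line_mixture(xs, ys, beta0=(-1.0, 1.0), sigma=1.0, iters=50):
    """EM for a two-component mixture of linear regressions y = b_k * x + noise.

    E-step: responsibilities from Gaussian likelihoods at fixed sigma.
    M-step: weighted least-squares update of each slope b_k.
    """
    b = list(beta0)
    for _ in range(iters):
        # E-step: posterior responsibility of component 0 for each point.
        resp = []
        for x, y in zip(xs, ys):
            l0 = math.exp(-((y - b[0] * x) ** 2) / (2 * sigma ** 2))
            l1 = math.exp(-((y - b[1] * x) ** 2) / (2 * sigma ** 2))
            resp.append(l0 / (l0 + l1))
        # M-step: responsibility-weighted least squares for each slope.
        for k, w in ((0, resp), (1, [1 - r for r in resp])):
            num = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
            den = sum(wi * x * x for wi, x in zip(w, xs))
            if den > 0:
                b[k] = num / den
    return b
```

On well-separated data the iterates converge to the two generating slopes; the agnostic question the paper addresses is what EM converges to when no such ground-truth mixture generated the data.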