Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants
arXiv:2604.00842v1 Announce Type: new Abstract: Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assistants, yet the lack of realistic user simulation frameworks hinders their development. Existing approaches model apps as flat tool-calling APIs,...
Amazon is trying to buy Globalstar to compete with SpaceX's Starlink
Amazon wants in on the low-Earth orbit Internet action.
LinearARD: Linear-Memory Attention Distillation for RoPE Restoration
arXiv:2604.00004v1 Announce Type: cross Abstract: The extension of context windows in Large Language Models is typically facilitated by scaling positional encodings followed by lightweight Continual Pre-Training (CPT). While effective for processing long sequences, this paradigm often disrupts original model capabilities,...
A Retrospective on the ICLR 2026 Review Process
Dual-Attention Based 3D Channel Estimation
arXiv:2604.01769v1 Announce Type: new Abstract: For multi-input and multi-output (MIMO) channels, the optimal channel estimation (CE) based on linear minimum mean square error (LMMSE) requires three-dimensional (3D) filtering. However, the complexity is often prohibitive due to large matrix dimensions. Suboptimal...
MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis
arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual modalities. Although Multimodal Large Language Models (MLLMs) have achieved state-of-the-art performance via supervised fine-tuning (SFT), their end-to-end "black-box" nature limits interpretability....
PsychAgent: An Experience-Driven Lifelong Learning Agent for Self-Evolving Psychological Counselor
arXiv:2604.00931v2 Announce Type: new Abstract: Existing methods for AI psychological counselors predominantly rely on supervised fine-tuning using static dialogue datasets. However, this contrasts with human experts, who continuously refine their proficiency through clinical practice and accumulated experience. To bridge this...
How Do Language Models Process Ethical Instructions? Deliberation, Consistency, and Other-Recognition Across Four Models
arXiv:2604.00021v1 Announce Type: cross Abstract: Alignment safety research assumes that ethical instructions improve model behavior, but how language models internally process such instructions remains unknown. We conducted over 600 multi-agent simulations across four models (Llama 3.3 70B, GPT-4o mini, Qwen3-Next-80B-A3B,...
Are they human? Detecting large language models by probing human memory constraints
arXiv:2604.00016v1 Announce Type: cross Abstract: The validity of online behavioral research relies on study participants being human rather than machine. In the past, it was possible to detect machines by posing simple challenges that were easily solved by humans but...
Beyond Logit Adjustment: A Residual Decomposition Framework for Long-Tailed Reranking
arXiv:2604.01506v1 Announce Type: new Abstract: Long-tailed classification, where a small number of frequent classes dominate many rare ones, remains challenging because models systematically favor frequent classes at inference time. Existing post-hoc methods such as logit adjustment address this by adding...
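For readers unfamiliar with the baseline named in this abstract, here is a minimal sketch of post-hoc logit adjustment as it is commonly described in the long-tail literature: subtract a scaled log class prior from each logit at inference time. This is an assumption about the standard baseline, not the paper's residual decomposition method; the function name `logit_adjust`, the scale `tau`, and the toy numbers are hypothetical.

```python
import numpy as np

def logit_adjust(logits, class_priors, tau=1.0):
    """Subtract tau * log(prior) from each class logit (hypothetical helper)."""
    logits = np.asarray(logits, dtype=float)
    priors = np.asarray(class_priors, dtype=float)
    # Rare classes have a very negative log prior, so they receive a larger boost.
    return logits - tau * np.log(priors)

# Toy example: class 0 is frequent (90%), class 1 is rare (10%).
logits = np.array([[2.0, 1.5]])
priors = np.array([0.9, 0.1])

adjusted = logit_adjust(logits, priors)
pred_raw = logits.argmax(axis=1)    # prediction from the unadjusted logits
pred_adj = adjusted.argmax(axis=1)  # prediction after the prior correction
```

In this toy case the raw logits pick the frequent class while the adjusted logits pick the rare one, which is the bias correction the baseline aims for.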
Detecting Abnormal User Feedback Patterns through Temporal Sentiment Aggregation
arXiv:2604.00020v1 Announce Type: new Abstract: In many real-world applications, such as customer feedback monitoring, brand reputation management, and product health tracking, understanding the temporal dynamics of user sentiment is crucial for early detection of anomalous events such as malicious review...
Think Twice Before You Write -- an Entropy-based Decoding Strategy to Enhance LLM Reasoning
arXiv:2604.00018v1 Announce Type: cross Abstract: Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches introduce randomness...
Detecting Multi-Agent Collusion Through Multi-Agent Interpretability
arXiv:2604.01151v1 Announce Type: new Abstract: As LLM agents are increasingly deployed in multi-agent systems, they introduce risks of covert coordination that may evade standard forms of human oversight. While linear probes on model activations have shown promise for detecting deception...
CuTeGen: An LLM-Based Agentic Framework for Generation and Optimization of High-Performance GPU Kernels using CuTe
arXiv:2604.01489v1 Announce Type: new Abstract: High-performance GPU kernels are critical to modern machine learning systems, yet developing efficient implementations remains a challenging, expert-driven process due to the tight coupling between algorithmic structure, memory hierarchy usage, and hardware-specific optimizations. Recent work...
When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals
arXiv:2604.01476v1 Announce Type: new Abstract: Reinforcement learning for LLMs is vulnerable to reward hacking, where models exploit shortcuts to maximize reward without solving the intended task. We systematically study this phenomenon in coding tasks using an environment-manipulation setting, where models...
Therefore I am. I Think
arXiv:2604.01202v2 Announce Type: new Abstract: We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable,...
SECURE: Stable Early Collision Understanding via Robust Embeddings in Autonomous Driving
arXiv:2604.01337v1 Announce Type: new Abstract: While deep learning has significantly advanced accident anticipation, the robustness of these safety-critical systems against real-world perturbations remains a major challenge. We reveal that state-of-the-art models like CRASH, despite their high performance, exhibit significant instability...
Large Language Models in the Abuse Detection Pipeline
arXiv:2604.00323v1 Announce Type: new Abstract: Online abuse has grown increasingly complex, spanning toxic language, harassment, manipulation, and fraudulent behavior. Traditional machine-learning approaches dependent on static classifiers and labor-intensive labeling struggle to keep pace with evolving threat patterns and nuanced policy...
An Online Machine Learning Multi-resolution Optimization Framework for Energy System Design Limit of Performance Analysis
arXiv:2604.01308v1 Announce Type: new Abstract: Designing reliable integrated energy systems for industrial processes requires optimization and verification models across multiple fidelities, from architecture-level sizing to high-fidelity dynamic operation. However, model mismatch across fidelities obscures the sources of performance loss and...
Soft MPCritic: Amortized Model Predictive Value Iteration
arXiv:2604.01477v1 Announce Type: new Abstract: Reinforcement learning (RL) and model predictive control (MPC) offer complementary strengths, yet combining them at scale remains computationally challenging. We propose soft MPCritic, an RL-MPC framework that learns in (soft) value space while using sample-based...
Retrospective on PAT x ICML 2026 AI Paper Assistant Program
Improving Latent Generalization Using Test-time Compute
arXiv:2604.01430v1 Announce Type: new Abstract: Language Models (LMs) exhibit two distinct mechanisms for knowledge acquisition: in-weights learning (i.e., encoding information within the model weights) and in-context learning (ICL). Although these two modes offer complementary strengths, in-weights learning frequently struggles to...
Nomadic raises $8.4 million to wrangle the data pouring off autonomous vehicles
The company turns footage from robots into structured, searchable datasets with a deep learning model.
Learning ECG Image Representations via Dual Physiological-Aware Alignments
arXiv:2604.01526v1 Announce Type: new Abstract: Electrocardiograms (ECGs) are among the most widely used diagnostic tools for cardiovascular diseases, and a large amount of ECG data worldwide appears only in image form. However, most existing automated ECG analysis methods rely on...
Massively Parallel Exact Inference for Hawkes Processes
arXiv:2604.01342v1 Announce Type: new Abstract: Multivariate Hawkes processes are a widely used class of self-exciting point processes, but maximum likelihood estimation naively scales as $O(N^2)$ in the number of events. The canonical linear exponential Hawkes process admits a faster $O(N)$...
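As background for the complexity claim in this abstract, the sketch below contrasts the naive $O(N^2)$ log-likelihood with the classical $O(N)$ recursion for an exponential kernel. It is deliberately simplified to the univariate case with intensity $\lambda(t) = \mu + \alpha \sum_{t_i < t} e^{-\beta (t - t_i)}$; this parametrization and the function names are assumptions for illustration, and the code is not the paper's parallel algorithm.

```python
import math

def hawkes_loglik_naive(times, T, mu, alpha, beta):
    """O(N^2): re-sum the kernel over all earlier events for every event."""
    ll = 0.0
    for i, t in enumerate(times):
        excitation = sum(math.exp(-beta * (t - s)) for s in times[:i])
        ll += math.log(mu + alpha * excitation)
    # Compensator: integral of the intensity over the window [0, T].
    comp = mu * T + (alpha / beta) * sum(1.0 - math.exp(-beta * (T - s)) for s in times)
    return ll - comp

def hawkes_loglik_recursive(times, T, mu, alpha, beta):
    """O(N): carry the decayed sum A_i = exp(-beta * dt) * (1 + A_{i-1})."""
    ll = 0.0
    A = 0.0
    prev = None
    for t in times:
        if prev is not None:
            A = math.exp(-beta * (t - prev)) * (1.0 + A)
        ll += math.log(mu + alpha * A)
        prev = t
    comp = mu * T + (alpha / beta) * sum(1.0 - math.exp(-beta * (T - s)) for s in times)
    return ll - comp

# Toy event times (sorted) on the window [0, 5]; parameters are arbitrary.
events = [0.5, 1.2, 1.3, 2.7, 4.0]
ll_naive = hawkes_loglik_naive(events, 5.0, mu=0.4, alpha=0.6, beta=1.5)
ll_fast = hawkes_loglik_recursive(events, 5.0, mu=0.4, alpha=0.6, beta=1.5)
```

Both functions compute the same quantity; the recursion works because the exponential kernel lets the contribution of all past events be updated with a single multiply per new event.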
Beyond Symbolic Solving: Multi Chain-of-Thought Voting for Geometric Reasoning in Large Language Models
arXiv:2604.00890v1 Announce Type: new Abstract: Geometric Problem Solving (GPS) remains at the heart of enhancing mathematical reasoning in large language models because it requires the combination of diagrammatic understanding, symbolic manipulation and logical inference. In existing literature, researchers have chiefly...
UQ-SHRED: uncertainty quantification of shallow recurrent decoder networks for sparse sensing via engression
arXiv:2604.01305v1 Announce Type: new Abstract: Reconstructing high-dimensional spatiotemporal fields from sparse sensor measurements is critical in a wide range of scientific applications. The SHallow REcurrent Decoder (SHRED) architecture is a recent state-of-the-art architecture that reconstructs high-quality spatial domain from hyper-sparse...
FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models
arXiv:2604.01762v1 Announce Type: new Abstract: Parameter-efficient fine-tuning (PEFT) has emerged as a crucial paradigm for adapting large language models (LLMs) under constrained computational budgets. However, standard PEFT methods often struggle in multi-task fine-tuning settings, where diverse optimization objectives induce task...
Residuals-based Offline Reinforcement Learning
arXiv:2604.01378v1 Announce Type: new Abstract: Offline reinforcement learning (RL) has received increasing attention for learning policies from previously collected data without interaction with the real environment, which is particularly important in high-stakes applications. While a growing body of work has...