An Initial Exploration of Contrastive Prompt Tuning to Generate Energy-Efficient Code
arXiv:2604.02352v1 Announce Type: cross Abstract: Although LLMs are capable of generating functionally correct code, they also tend to produce less energy-efficient code in comparison to human-written solutions. As these inefficiencies lead to higher computational overhead, they are in direct conflict...
EMS: Multi-Agent Voting via Efficient Majority-then-Stopping
arXiv:2604.02863v1 Announce Type: new Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditional methods typically require all agents to complete their reasoning before aggregation begins, leading to significant computational overhead, as many responses...
AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents
arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments. Unlike chat systems, they maintain state across interactions and translate intermediate outputs into concrete actions. This creates a...
Communication-free Sampling and 4D Hybrid Parallelism for Scalable Mini-batch GNN Training
arXiv:2604.02651v1 Announce Type: new Abstract: Graph neural networks (GNNs) are widely used for learning on graph datasets derived from various real-world scenarios. Learning from extremely large graphs requires distributed training, and mini-batching with sampling is a popular approach for parallelizing...
From Broad Exploration to Stable Synthesis: Entropy-Guided Optimization for Autoregressive Image Generation
arXiv:2604.02355v1 Announce Type: new Abstract: Combining Chain-of-Thought (CoT) with Reinforcement Learning (RL) improves text-to-image (T2I) generation, yet the underlying interaction between CoT's exploration and RL's optimization remains unclear. We present a systematic entropy-based analysis that yields three key insights: (1)...
Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection
arXiv:2604.02819v1 Announce Type: new Abstract: Large reasoning models achieve strong performance on complex tasks through long chain-of-thought (CoT) trajectories, but directly transferring such reasoning processes to smaller models remains challenging. A key difficulty is that not all teacher-generated reasoning trajectories...
FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models
arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning tasks, exhibiting human-like patterns in exploring multiple alternative solutions. Upon closer inspection, however, we uncover a surprising phenomenon: The First is...
Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity
arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobeying role specification (failure to adhere to the defined responsibilities and constraints of an assigned role, potentially leading to an agent behaving like another) is a major failure...
Learning the Signature of Memorization in Autoregressive Language Models
arXiv:2604.03199v1 Announce Type: new Abstract: All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K%, reference calibration), each bounded by the designer's intuition. We introduce the first transferable learned attack, enabled by the observation...
What oral argument told us in the birthright citizenship case
Empirical SCOTUS is a recurring series by Adam Feldman that looks at Supreme Court data, primarily in the form of opinions and oral arguments, to provide insights into the justices’ decision making and […]
OpenAI executive shuffle includes new role for COO Brad Lightcap to lead ‘special projects’
In addition to Lightcap's new role, OpenAI CMO Kate Rouch will be stepping away from the company to focus on cancer recovery, with a plan to return when her health allows.
The Enumerated-Rights Reading of the Privileges or Immunities Clause: A Response to Barnett and Bernick
By Kurt T. Lash. In 1871, John Bingham explained the meaning of the Fourteenth Amendment’s Privileges or Immunities Clause—a clause Bingham himself drafted and had...
The Privileges or Immunities Clause, Abridged: A Critique of Kurt Lash on the Fourteenth Amendment
By Randy E. Barnett and Evan D. Bernick. The Privileges or Immunities Clause of the Fourteenth Amendment reads: “No State shall make or enforce any...
What’s new for the Position Paper Track at NeurIPS 2026
Therefore I am. I Think
arXiv:2604.01202v2 Announce Type: new Abstract: We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable,...
Supreme Court appears likely to side against Trump on birthright citizenship
Updated on April 1 at 10:10 p.m. On Jan. 20, 2025, President Donald Trump signed an executive order that would end birthright citizenship – the guarantee of U.S. citizenship to […]
In harmony with gpt-oss
arXiv:2604.00362v1 Announce Type: new Abstract: No one has independently reproduced OpenAI's published scores for gpt-oss-20b with tools, because the original paper discloses neither the tools nor the agent harness. We reverse-engineered the model's in-distribution tools: when prompted without tool definitions,...
Optimizing EEG Graph Structure for Seizure Detection: An Information Bottleneck and Self-Supervised Learning Approach
arXiv:2604.01595v1 Announce Type: new Abstract: Seizure detection from EEG signals is highly challenging due to complex spatiotemporal dynamics and extreme inter-patient variability. To model them, recent methods construct dynamic graphs via statistical correlations, predefined similarity measures, or implicit learning, yet...
Criterion Validity of LLM-as-Judge for Business Outcomes in Conversational Commerce
arXiv:2604.00022v1 Announce Type: cross Abstract: Multi-dimensional rubric-based dialogue evaluation is widely used to assess conversational AI, yet its criterion validity -- whether quality scores are associated with the downstream outcomes they are meant to serve -- remains largely untested. We...
Collaborative AI Agents and Critics for Fault Detection and Cause Analysis in Network Telemetry
arXiv:2604.00319v1 Announce Type: new Abstract: We develop algorithms for collaborative control of AI agents and critics in a multi-actor, multi-critic federated multi-agent system. Each AI agent and critic has access to classical machine learning or generative AI foundation models. The...
Birthright citizenship live blog for Wednesday, April 1
On Wednesday, April 1, we will be live blogging as the court hears argument in Trump v. Barbara, on the constitutionality of President Donald Trump’s executive order on birthright citizenship. […]
LLM Essay Scoring Under Holistic and Analytic Rubrics: Prompt Effects and Bias
arXiv:2604.00259v1 Announce Type: new Abstract: Despite growing interest in using Large Language Models (LLMs) for educational assessment, it remains unclear how closely they align with human scoring. We present a systematic evaluation of instruction-tuned LLMs across three open essay-scoring datasets...
Trump attends birthright citizenship argument
Updated on April 1 at 7:48 p.m. As soon as President Donald Trump last evening mentioned attending argument in the birthright citizenship case in Trump v. Barbara today, some Supreme […] The post Trump attends birthright citizenship argument appeared first on SCOTUSblog.
MiCA Learns More Knowledge Than LoRA and Full Fine-Tuning
arXiv:2604.01694v1 Announce Type: new Abstract: Minor Component Adaptation (MiCA) is a novel parameter-efficient fine-tuning method for large language models that focuses on adapting underutilized subspaces of model representations. Unlike conventional methods such as Low-Rank Adaptation (LoRA), which target dominant subspaces,...
Can Large Language Models Self-Correct in Medical Question Answering? An Exploratory Study
arXiv:2604.00261v2 Announce Type: new Abstract: Large language models (LLMs) have achieved strong performance on medical question answering (medical QA), and chain-of-thought (CoT) prompting has further improved results by eliciting explicit intermediate reasoning; meanwhile, self-reflective (self-corrective) prompting has been widely claimed...
Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning
arXiv:2604.00344v1 Announce Type: new Abstract: Large Language Models (LLMs) have shown remarkable performance in completing various tasks. However, solving complex problems often requires the coordination of multiple agents, raising a fundamental question: how to effectively select and interconnect these agents....
Frege in the Flesh: Biolinguistics and the Neural Enforcement of Syntactic Structures
arXiv:2604.00291v1 Announce Type: new Abstract: Biolinguistics is the interdisciplinary scientific study of the biological foundations, evolution, and genetic basis of human language. It treats language as an innate biological organ or faculty of the mind, rather than a cultural tool,...
TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models
arXiv:2604.00666v1 Announce Type: new Abstract: Diffusion language models (DLMs) offer a promising path toward low-latency generation through parallel decoding, but their practical efficiency depends heavily on the decoding trajectory. In practice, this advantage often fails to fully materialize because standard...
Improvisational Games as a Benchmark for Social Intelligence of AI Agents: The Case of Connections
arXiv:2604.00284v1 Announce Type: new Abstract: We formally introduce an improvisational wordplay game called Connections to explore reasoning capabilities of AI agents. Playing Connections combines skills in knowledge retrieval, summarization and awareness of cognitive states of other agents. We show how...
DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data
arXiv:2604.01481v1 Announce Type: new Abstract: The development of robust clinical decision support systems is frequently impeded by the scarcity of high-fidelity, privacy-preserving biomedical data. While Generative Large Language Models (LLMs) offer a promising avenue for synthetic data generation, they often...