DRAFT: Task Decoupled Latent Reasoning for Agent Safety
arXiv:2604.03242v1 Announce Type: new Abstract: The advent of tool-using LLM agents shifts safety monitoring from output moderation to auditing long, noisy interaction trajectories, where risk-critical evidence is sparse, making standard binary supervision poorly suited for credit assignment. To address this, we...
Algebraic Diversity: Group-Theoretic Spectral Estimation from Single Observations
arXiv:2604.03634v1 Announce Type: new Abstract: We prove that temporal averaging over multiple observations can be replaced by algebraic group action on a single observation for second-order statistical estimation. A General Replacement Theorem establishes conditions under which a group-averaged estimator from...
Announcing the ICML 2026 Workshops and Affinity Workshops
Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting
arXiv:2604.04145v1 Announce Type: new Abstract: Photovoltaic (PV) power forecasting plays a critical role in power system dispatch and market participation. Because PV generation is highly sensitive to weather conditions and cloud motion, accurate forecasting requires effective modeling of complex spatiotemporal...
The Format Tax
arXiv:2604.03616v1 Announce Type: new Abstract: Asking a large language model to respond in JSON should be a formatting choice, not a capability tax. Yet we find that structured output requirements -- JSON, XML, LaTeX, Markdown -- substantially degrade reasoning and...
Collapse-Free Prototype Readout Layer for Transformer Encoders
arXiv:2604.03850v1 Announce Type: new Abstract: DDCL-Attention is a prototype-based readout layer for transformer encoders that replaces simple pooling methods, such as mean pooling or class tokens, with a learned compression mechanism. It uses a small set of global prototype vectors...
SKILLFOUNDRY: Building Self-Evolving Agent Skill Libraries from Heterogeneous Scientific Resources
arXiv:2604.03964v1 Announce Type: new Abstract: Modern scientific ecosystems are rich in procedural knowledge across repositories, APIs, scripts, notebooks, documentation, databases, and papers, yet much of this knowledge remains fragmented across heterogeneous artifacts that agents cannot readily operationalize. This gap between...
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation
arXiv:2604.03922v1 Announce Type: new Abstract: Selecting LLM-generated code candidates using LLM-generated tests is challenging because the tests themselves may be incorrect. Existing methods either treat all tests equally or rely on ad-hoc heuristics to filter unreliable tests. Yet determining test...
GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces
arXiv:2604.04017v1 Announce Type: new Abstract: Deep research agents integrate fragmented evidence through multi-step tool use. BrowseComp offers a text-only testbed for such agents, but existing multimodal benchmarks rarely require both weak visual cues composition and BrowseComp-style multi-hop verification. Geolocation is...
Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents
arXiv:2604.04157v1 Announce Type: new Abstract: Theory of Mind (ToM) -- the ability to model others' mental states -- is fundamental to human social cognition. Whether large language models (LLMs) can develop ToM has been tested exclusively through static vignettes, leaving...
Evaluating Artificial Intelligence Through a Christian Understanding of Human Flourishing
arXiv:2604.03356v1 Announce Type: new Abstract: Artificial intelligence (AI) alignment is fundamentally a formation problem, not only a safety problem. As Large Language Models (LLMs) increasingly mediate moral deliberation and spiritual inquiry, they do more than provide information; they function as...
Understanding the Nature of Generative AI as Threshold Logic in High-Dimensional Space
arXiv:2604.02476v1 Announce Type: new Abstract: This paper examines the role of threshold logic in understanding generative artificial intelligence. Threshold functions, originally studied in the 1960s in digital circuit synthesis, provide a structurally transparent model of neural computation: a weighted sum...
ROMAN: A Multiscale Routing Operator for Convolutional Time Series Models
arXiv:2604.02577v1 Announce Type: new Abstract: We introduce ROMAN (ROuting Multiscale representAtioN), a deterministic operator for time series that maps temporal scale and coarse temporal position into an explicit channel structure while reducing sequence length. ROMAN builds an anti-aliased multiscale pyramid,...
InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking
arXiv:2604.02971v1 Announce Type: new Abstract: Recent agentic search systems have made substantial progress by emphasising deep, multi-step reasoning. However, this focus often overlooks the challenges of wide-scale information synthesis, where agents must aggregate large volumes of heterogeneous evidence across many...
StoryScope: Investigating idiosyncrasies in AI fiction
arXiv:2604.03136v1 Announce Type: new Abstract: As AI-generated fiction becomes increasingly prevalent, questions of authorship and originality are becoming central to how written work is evaluated. While most existing work in this space focuses on identifying surface-level signatures of AI writing,...
Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers
arXiv:2604.02344v1 Announce Type: new Abstract: WebGPU's security-focused design imposes per-operation validation that compounds across the many small dispatches in neural network inference, yet the true cost of this overhead is poorly characterized. We present a systematic characterization of WebGPU dispatch...
LiME: Lightweight Mixture of Experts for Efficient Multimodal Multi-task Learning
arXiv:2604.02338v1 Announce Type: new Abstract: MoE-PEFT methods combine Mixture of Experts with parameter-efficient fine-tuning for multi-task adaptation, but require separate adapters per expert causing trainable parameters to scale linearly with expert count and limiting applicability to adapter-based architectures. We propose...
Redirected, Not Removed: Task-Dependent Stereotyping Reveals the Limits of LLM Alignments
arXiv:2604.02669v1 Announce Type: new Abstract: How biased is a language model? The answer depends on how you ask. A model that refuses to choose between castes for a leadership role will, in a fill-in-the-blank task, reliably associate upper castes with...
Compositional Neuro-Symbolic Reasoning
arXiv:2604.02434v1 Announce Type: new Abstract: We study structured abstraction-based reasoning for the Abstraction and Reasoning Corpus (ARC) and compare its generalization to test-time approaches. Purely neural architectures lack reliable combinatorial generalization, while strictly symbolic systems struggle with perceptual grounding. We...
What oral argument told us in the birthright citizenship case
Empirical SCOTUS is a recurring series by Adam Feldman that looks at Supreme Court data, primarily in the form of opinions and oral arguments, to provide insights into the justices’ decision making and […]
The Privileges or Immunities Clause, Abridged: A Critique of Kurt Lash on the Fourteenth Amendment
ARTICLE The Privileges or Immunities Clause, Abridged: A Critique of Kurt Lash on the Fourteenth Amendment Randy E. Barnett* & Evan D. Bernick** The Privileges or Immunities Clause of the Fourteenth Amendment reads: “No State shall make or enforce any...
What’s new for the Position Paper Track at NeurIPS 2026
Trump attends birthright citizenship argument
Updated on April 1 at 7:48 p.m. As soon as President Donald Trump last evening mentioned attending argument in the birthright citizenship case in Trump v. Barbara today, some Supreme […] The post Trump attends birthright citizenship argument appeared first on SCOTUSblog.
Retrospective on PAT x ICML 2026 AI Paper Assistant Program
Find Your Next Job
Association for the Advancement of Artificial Intelligence (AAAI) - Find your next career at AAAI Career Center. Check back frequently as new jobs are posted every day.
DDCL: Deep Dual Competitive Learning: A Differentiable End-to-End Framework for Unsupervised Prototype-Based Representation Learning
arXiv:2604.01740v1 Announce Type: new Abstract: A persistent structural weakness in deep clustering is the disconnect between feature learning and cluster assignment. Most architectures invoke an external clustering step, typically k-means, to produce pseudo-labels that guide training, preventing the backbone from...
Can LLMs Perceive Time? An Empirical Investigation
arXiv:2604.00010v1 Announce Type: cross Abstract: Large language models cannot estimate how long their own tasks take. We investigate this limitation through four experiments across 68 tasks and four model families. Pre-task estimates overshoot actual duration by 4--7$\times$ ($p < 0.001$),...
Birthright citizenship live blog for Wednesday, April 1
On Wednesday, April 1, we will be live blogging as the court hears argument in Trump v. Barbara, on the constitutionality of President Donald Trump’s executive order on birthright citizenship. […]
SCOTUStoday for Wednesday, April 1
This morning, the court will hear argument in the birthright citizenship case, Trump v. Barbara. We will be live blogging beginning at 9:30 a.m. EDT. For a great introduction to […]
Advisory Opinions broadcast: President Donald Trump and birthright citizenship
Oral arguments in Trump v. Barbara, on the constitutionality of President Donald Trump’s executive order on birthright citizenship, have concluded, but the conversation isn’t over. Listen now to a special […]