Equivariant Asynchronous Diffusion: An Adaptive Denoising Schedule for Accelerated Molecular Conformation Generation
arXiv:2603.10093v1 Announce Type: new Abstract: Recent 3D molecular generation methods primarily use asynchronous auto-regressive or synchronous diffusion models. While auto-regressive models build molecules sequentially, they're limited by a short horizon and a discrepancy between training and inference. Conversely, synchronous diffusion...
Denoising the US Census: Succinct Block Hierarchical Regression
arXiv:2603.10099v1 Announce Type: new Abstract: The US Census Bureau Disclosure Avoidance System (DAS) balances confidentiality and utility requirements for the decennial US Census (Abowd et al., 2022). The DAS was used in the 2020 Census to produce demographic datasets critically...
Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs
arXiv:2603.10100v1 Announce Type: new Abstract: Modern CNNs' high computational demands hinder edge deployment, as traditional "hard" sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations like Tanh. We propose a "soft sparsity" paradigm using a hardware...
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR
arXiv:2603.10101v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has significantly advanced the reasoning capacity of Large Language Models (LLMs). However, RLVR solely relies on final answers as outcome rewards, neglecting the correctness of intermediate reasoning steps. Training...
Lost in the Middle at Birth: An Exact Theory of Transformer Position Bias
arXiv:2603.10123v1 Announce Type: new Abstract: The "Lost in the Middle" phenomenon -- a U-shaped performance curve where LLMs retrieve well from the beginning and end of a context but fail in the middle -- is widely attributed to learned Softmax...
Mashup Learning: Faster Finetuning by Remixing Past Checkpoints
arXiv:2603.10156v1 Announce Type: new Abstract: Finetuning on domain-specific data is a well-established method for enhancing LLM performance on downstream tasks. Training on each dataset produces a new set of model weights, resulting in a multitude of checkpoints saved in-house or...
DT-BEHRT: Disease Trajectory-aware Transformer for Interpretable Patient Representation Learning
arXiv:2603.10180v1 Announce Type: new Abstract: The growing adoption of electronic health record (EHR) systems has provided unprecedented opportunities for predictive modeling to guide clinical decision making. Structured EHRs contain longitudinal observations of patients across hospital visits, where each visit is...
Rethinking the Harmonic Loss via Non-Euclidean Distance Layers
arXiv:2603.10225v1 Announce Type: new Abstract: Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is...
SiMPO: Measure Matching for Online Diffusion Reinforcement Learning
arXiv:2603.10250v1 Announce Type: new Abstract: A commonly used family of RL algorithms for diffusion policies conducts softmax reweighting over the behavior policy, which usually induces an over-greedy policy and fails to leverage feedback from negative samples. In this work, we...
Improving TabPFN's Synthetic Data Generation by Integrating Causal Structure
arXiv:2603.10254v1 Announce Type: new Abstract: Synthetic tabular data generation addresses data scarcity and privacy constraints in a variety of domains. Tabular Prior-Data Fitted Network (TabPFN), a recent foundation model for tabular data, has been shown capable of generating high-quality synthetic...
Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals
arXiv:2603.10261v1 Announce Type: new Abstract: We report the discovery and extraction of a compact hematopoietic algorithm from the single-cell foundation model scGPT, to our knowledge the first biologically useful, competitive algorithm extracted from a foundation model via mechanistic interpretability. We...
Robust Post-Training for Generative Recommenders: Why Exponential Reward-Weighted SFT Outperforms RLHF
arXiv:2603.10279v1 Announce Type: new Abstract: Aligning generative recommender systems to user preferences via post-training is critical for closing the gap between next-item prediction and actual recommendation quality. Existing post-training methods are ill-suited for production-scale systems: RLHF methods reward hack due...
Taming Score-Based Denoisers in ADMM: A Convergent Plug-and-Play Framework
arXiv:2603.10281v1 Announce Type: new Abstract: While score-based generative models have emerged as powerful priors for solving inverse problems, directly integrating them into optimization algorithms such as ADMM remains nontrivial. Two central challenges arise: i) the mismatch between the noisy data...
Copula-ResLogit: A Deep-Copula Framework for Unobserved Confounding Effects
arXiv:2603.10284v1 Announce Type: new Abstract: A key challenge in travel demand analysis is the presence of unobserved factors that may generate non-causal dependencies, obscuring the true causal effects. To address the issue, the study introduces a novel deep learning based...
Regime-aware financial volatility forecasting via in-context learning
arXiv:2603.10299v1 Announce Type: new Abstract: This work introduces a regime-aware in-context learning framework that leverages large language models (LLMs) for financial volatility forecasting under nonstationary market conditions. The proposed approach deploys pretrained LLMs to reason over historical volatility patterns and...
How to make the most of your masked language model for protein engineering
arXiv:2603.10302v1 Announce Type: new Abstract: A plethora of protein language models have been released in recent years. Yet comparatively little work has addressed how to best sample from them to optimize desired biological properties. We fill this gap by proposing...
Data-Driven Integration Kernels for Interpretable Nonlocal Operator Learning
arXiv:2603.10305v1 Announce Type: new Abstract: Machine learning models can represent climate processes that are nonlocal in horizontal space, height, and time, often by combining information across these dimensions in highly nonlinear ways. While this can improve predictive skill, it makes...
Optimal Expert-Attention Allocation in Mixture-of-Experts: A Scalable Law for Dynamic Model Design
arXiv:2603.10379v1 Announce Type: new Abstract: This paper presents a novel extension of neural scaling laws to Mixture-of-Experts (MoE) models, focusing on the optimal allocation of compute between expert and attention sub-layers. As MoE architectures have emerged as an efficient method...
Variance-Aware Adaptive Weighting for Diffusion Model Training
arXiv:2603.10391v1 Announce Type: new Abstract: Diffusion models have recently achieved remarkable success in generative modeling, yet their training dynamics across different noise levels remain highly imbalanced, which can lead to inefficient optimization and unstable learning behavior. In this work, we...
Graph-GRPO: Training Graph Flow Models with Reinforcement Learning
arXiv:2603.10395v1 Announce Type: new Abstract: Graph generation is a fundamental task with broad applications, such as drug discovery. Recently, discrete flow matching-based graph generation, a.k.a. graph flow model (GFM), has emerged due to its superior performance and flexible sampling. However,...
On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD
arXiv:2603.10397v1 Announce Type: new Abstract: One crucial factor behind the success of deep learning lies in the implicit bias induced by noise inherent in gradient-based training algorithms. Motivated by empirical observations that training with noisy labels improves model generalization, we...
The 14th Amendment’s citizenship clause does not codify English principles of subjectship
Critics and supporters of President Donald Trump’s executive order on birthright citizenship often focus on the order’s barring of automatic citizenship to children born to individuals unlawfully present in the […]
Amazon expands a program that lets customers shop from other retailers’ sites
The changes allow more merchants to participate in Amazon's Shop Direct program, which sends Amazon customers to other retailers' websites.
Sun Valley Orchards, LLC v. United States Department of Labor
In SEC v. Jarkesy, the Supreme Court failed to fully clarify the “unquestionably muddy” relationship between Article III and the Seventh Amendment. Yet it... The post Sun Valley Orchards, LLC v. United States Department of Labor appeared first on Harvard Law Review.
Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health
arXiv:2603.09416v1 Announce Type: new Abstract: Large Language Models (LLMs) excel in Natural Language Processing (NLP) tasks, but they often propagate biases embedded in their training data, which is potentially impactful in sensitive domains like healthcare. While existing benchmarks evaluate biases...
EPOCH: An Agentic Protocol for Multi-Round System Optimization
arXiv:2603.09049v1 Announce Type: new Abstract: Autonomous agents are increasingly used to improve prompts, code, and machine learning systems through iterative execution and feedback. Yet existing approaches are usually designed as task-specific optimization loops rather than as a unified protocol for...
Explainable Innovation Engine: Dual-Tree Agent-RAG with Methods-as-Nodes and Verifiable Write-Back
arXiv:2603.09192v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) improves factual grounding, yet most systems rely on flat chunk retrieval and provide limited control over multi-step synthesis. We propose an Explainable Innovation Engine that upgrades the knowledge unit from text chunks...
The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness
arXiv:2603.09200v1 Announce Type: new Abstract: Situational awareness, the capacity of an AI system to recognize its own nature, understand its training and deployment context, and reason strategically about its circumstances, is widely considered among the most dangerous emergent capabilities in...
PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories -- A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution
arXiv:2603.09641v1 Announce Type: new Abstract: LLM agents that store knowledge as natural language suffer steep retrieval degradation as condition count grows, often struggle to compose learned rules reliably, and typically lack explicit mechanisms to detect stale or adversarial knowledge. We...
EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages
arXiv:2603.09678v1 Announce Type: new Abstract: Large language models achieve near-ceiling performance on code generation benchmarks, yet these results increasingly reflect memorization rather than genuine reasoning. We introduce EsoLang-Bench, a benchmark using five esoteric programming languages (Brainfuck, Befunge-98, Whitespace, Unlambda, and...