Replaying pre-training data improves fine-tuning
arXiv:2603.04964v1 Announce Type: new Abstract: To obtain a language model for a target domain (e.g. math), the current paradigm is to pre-train on a vast amount of generic web text and then fine-tune on the relatively limited amount of target...
FedEMA-Distill: Exponential Moving Average Guided Knowledge Distillation for Robust Federated Learning
arXiv:2603.04422v1 Announce Type: new Abstract: Federated learning (FL) often degrades when clients hold heterogeneous non-Independent and Identically Distributed (non-IID) data and when some clients behave adversarially, leading to client drift, slow convergence, and high communication overhead. This paper proposes FedEMA-Distill,...
Delta-Crosscoder: Robust Crosscoder Model Diffing in Narrow Fine-Tuning Regimes
arXiv:2603.04426v1 Announce Type: new Abstract: Model diffing methods aim to identify how fine-tuning changes a model's internal representations. Crosscoders approach this by learning shared dictionaries of interpretable latent directions between base and fine-tuned models. However, existing formulations struggle with narrow...
Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
arXiv:2603.04427v1 Announce Type: new Abstract: Standard transformer attention uses identical dimensionality for queries, keys, and values ($d_q = d_k = d_v = d_{\text{model}}$). Our insight is that these components serve fundamentally different roles, and this symmetry is unnecessary. Queries and...
Flowers: A Warp Drive for Neural PDE Solvers
arXiv:2603.04430v1 Announce Type: new Abstract: We introduce Flowers, a neural architecture for learning PDE solution operators built entirely from multihead warps. Aside from pointwise channel mixing and a multiscale scaffold, Flowers use no Fourier multipliers, no dot-product attention, and no...
Uncertainty-Calibrated Spatiotemporal Field Diffusion with Sparse Supervision
arXiv:2603.04431v1 Announce Type: new Abstract: Physical fields are typically observed only at sparse, time-varying sensor locations, making forecasting and reconstruction ill-posed and uncertainty-critical. We present SOLID, a mask-conditioned diffusion framework that learns spatiotemporal dynamics from sparse observations alone: training and...
Learning Unified Distance Metric for Heterogeneous Attribute Data Clustering
arXiv:2603.04458v1 Announce Type: new Abstract: Datasets composed of numerical and categorical attributes (also called mixed data hereinafter) are common in real clustering tasks. Differing from numerical attributes that indicate tendencies between two concepts (e.g., high and low temperature) with their...
VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling
arXiv:2603.04460v1 Announce Type: new Abstract: The quadratic complexity of self-attention during the prefill phase impedes long-context inference in large language models. Existing sparse attention methods face a trade-off among context adaptivity, sampling overhead, and fine-tuning costs. We propose VSPrefill, a...
Standing on the Shoulders of Giants: Rethinking EEG Foundation Model Pretraining via Multi-Teacher Distillation
arXiv:2603.04478v1 Announce Type: new Abstract: Pretraining for electroencephalogram (EEG) foundation models has predominantly relied on self-supervised masked reconstruction, a paradigm largely adapted from and inspired by the success of vision and language foundation models. However, unlike images and text, EEG...
Augmenting representations with scientific papers
arXiv:2603.04516v1 Announce Type: new Abstract: Astronomers have acquired vast repositories of multimodal data, including images, spectra, and time series, complemented by decades of literature that analyzes astrophysical sources. Still, these data sources are rarely systematically integrated. This work introduces a...
Invariant Causal Routing for Governing Social Norms in Online Market Economies
arXiv:2603.04534v1 Announce Type: new Abstract: Social norms are stable behavioral patterns that emerge endogenously within economic systems through repeated interactions among agents. In online market economies, such norms -- like fair exposure, sustained participation, and balanced reinvestment -- are critical...
PDE foundation model-accelerated inverse estimation of system parameters in inertial confinement fusion
arXiv:2603.04606v1 Announce Type: new Abstract: PDE foundation models are typically pretrained on large, diverse corpora of PDE datasets and can be adapted to new settings with limited task-specific data. However, most downstream evaluations focus on forward problems, such as autoregressive...
When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift
arXiv:2603.04648v1 Announce Type: new Abstract: Real-world reinforcement learning systems must operate under distributional drift in their observation streams, yet most policy architectures implicitly assume fully observed and noise-free states. We study robustness of Proximal Policy Optimization (PPO) under temporally persistent...
Engineering Regression Without Real-Data Training: Domain Adaptation for Tabular Foundation Models Using Multi-Dataset Embeddings
arXiv:2603.04692v1 Announce Type: new Abstract: Predictive modeling in engineering applications has long been dominated by bespoke models and small, siloed tabular datasets, limiting the applicability of large-scale learning approaches. Despite recent progress in tabular foundation models, the resulting synthetic training...
Probabilistic Dreaming for World Models
arXiv:2603.04715v1 Announce Type: new Abstract: "Dreaming" enables agents to learn from imagined experiences, enabling more robust and sample-efficient learning of world models. In this work, we consider innovations to the state-of-the-art Dreamer model using probabilistic methods that enable: (1) the...
Count Bridges enable Modeling and Deconvolving Transcriptomic Data
arXiv:2603.04730v1 Announce Type: new Abstract: Many modern biological assays, including RNA sequencing, yield integer-valued counts that reflect the number of molecules detected. These measurements are often not at the desired resolution: while the unit of interest is typically a single...
When Priors Backfire: On the Vulnerability of Unlearnable Examples to Pretraining
arXiv:2603.04731v1 Announce Type: new Abstract: Unlearnable Examples (UEs) serve as a data protection strategy that generates imperceptible perturbations to mislead models into learning spurious correlations instead of underlying semantics. In this paper, we uncover a fundamental vulnerability of UEs that...
ConTSG-Bench: A Unified Benchmark for Conditional Time Series Generation
arXiv:2603.04767v1 Announce Type: new Abstract: Conditional time series generation plays a critical role in addressing data scarcity and enabling causal analysis in real-world applications. Despite its increasing importance, the field lacks a standardized and systematic benchmarking framework for evaluating generative...
Distributional Reinforcement Learning with Information Bottleneck for Uncertainty-Aware DRAM Equalization
arXiv:2603.04768v1 Announce Type: new Abstract: Equalizer parameter optimization is critical for signal integrity in high-speed memory systems operating at multi-gigabit data rates. However, existing methods suffer from computationally expensive eye diagram evaluation, optimization of expected rather than worst-case performance, and...
FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation
arXiv:2603.04890v1 Announce Type: new Abstract: Multimodal Federated Learning (MFL) enables clients with heterogeneous data modalities to collaboratively train models without sharing raw data, offering a privacy-preserving framework that leverages complementary cross-modal information. However, existing methods often overlook personalized client performance...
EVMbench: Evaluating AI Agents on Smart Contract Security
arXiv:2603.04915v1 Announce Type: new Abstract: Smart contracts on public blockchains now manage large amounts of value, and vulnerabilities in these systems can lead to substantial losses. As AI agents become more capable at reading, writing, and running code, it is...
Immigration Enforcement and Constraints on Information Commandeering
The debate over American immigration policy reflects deep moral divides over the meaning of American identity and the scope of fundamental individual rights like due process and the freedom of movement. Although the modern American immigration system no longer includes...
State Anti-Doxing Statutes and #MeToo
In August 2014, a programmer named Eron Gjoni posted a 10,000-word exposé on his blog about video game developer Zoë Quinn, including screenshots of private emails, text messages, and Facebook messages. In the several posts he published about Quinn, Gjoni...
The Non-Punishment Principle and Restorative Justice
The non-punishment principle is a legal norm that has increasingly gained legitimacy over the past quarter-century within international, regional, and domestic law on human trafficking. At its core, this principle opposes the punishment of human trafficking victims for unlawful conduct...
The Constitutionality of Indiscriminate Data Surveillance
Soon enough, the police will have the capacity to know almost everything about everyone. Not because most of us are suspected of doing anything wrong, but because indiscriminate data surveillance—“indiscriminate” meaning precisely that it is not driven by individualized suspicion...
Power and Immunity in Youngstown and Trump v. United States
Introduction When the Supreme Court handed down its decision in Trump v. United States granting ex-presidents a broad new immunity from criminal prosecution, it ensured that President Donald Trump would likely never face criminal accountability for his efforts to remain...
Justices poised to adopt exceptions to federal criminal defendants’ appellate waivers
The Supreme Court heard oral argument on Tuesday in Hunter v. United States about what exceptions exist to federal defendants’ waivers of their right to appeal. The justices seemed poised […]
Birthright citizenship: the exceptions provide the rule
The battle over birthright citizenship is a battle over its exceptions. The 14th Amendment’s first sentence proudly proclaims that “[a]ll persons born . . . in the United States, and subject to the jurisdiction […]
Syrian nationals urge Supreme Court to keep ruling in place allowing them to stay in the United States
A group of Syrian nationals urged the Supreme Court on Thursday to leave in place a ruling by a federal judge in New York City that allows them to remain […]