Bi-Lipschitz Autoencoder With Injectivity Guarantee
arXiv:2604.06701v1 Announce Type: new Abstract: Autoencoders are widely used for dimensionality reduction, based on the assumption that high-dimensional data lies on low-dimensional manifolds. Regularized autoencoders aim to preserve manifold geometry during dimensionality reduction, but existing approaches often suffer from non-injective...
PD-SOVNet: A Physics-Driven Second-Order Vibration Operator Network for Estimating Wheel Polygonal Roughness from Axle-Box Vibrations
arXiv:2604.06620v1 Announce Type: new Abstract: Quantitative estimation of wheel polygonal roughness from axle-box vibration signals is a challenging yet practically relevant problem for rail-vehicle condition monitoring. Existing studies have largely focused on detection, identification, or severity classification, while continuous regression...
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment
arXiv:2604.06377v1 Announce Type: new Abstract: We investigate whether post-trained capabilities can be transferred across models without retraining, with a focus on transfer across different model scales. We propose the Master Key Hypothesis, which states that model capabilities correspond to directions...
Bridging Theory and Practice in Crafting Robust Spiking Reservoirs
arXiv:2604.06395v1 Announce Type: new Abstract: Spiking reservoir computing provides an energy-efficient approach to temporal processing, but reliably tuning reservoirs to operate at the edge-of-chaos is challenging due to experimental uncertainty. This work bridges abstract notions of criticality and practical stability...
Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise
arXiv:2604.06468v1 Announce Type: new Abstract: Most methods for learning with noisy labels require privileged knowledge such as noise transition matrices, clean subsets or pretrained feature extractors, resources typically unavailable when robustness is most needed. We propose Conformal Margin Risk Minimization...
From Load Tests to Live Streams: Graph Embedding-Based Anomaly Detection in Microservice Architectures
arXiv:2604.06448v1 Announce Type: new Abstract: Prime Video regularly conducts load tests to simulate the viewer traffic spikes seen during live events such as Thursday Night Football as well as video-on-demand (VOD) events such as Rings of Power. While these stress...
Quality-preserving Model for Electronics Production Quality Tests Reduction
arXiv:2604.06451v1 Announce Type: new Abstract: Manufacturing test flows in high-volume electronics production are typically fixed during product development and executed unchanged on every unit, even as failure patterns and process conditions evolve. This protects quality, but it also imposes unnecessary...
TwinLoop: Simulation-in-the-Loop Digital Twins for Online Multi-Agent Reinforcement Learning
arXiv:2604.06610v1 Announce Type: new Abstract: Decentralised online learning enables runtime adaptation in cyber-physical multi-agent systems, but when operating conditions change, learned policies often require substantial trial-and-error interaction before recovering performance. To address this, we propose TwinLoop, a simulation-in-the-loop digital twin...
Astropad’s Workbench reimagines remote desktop for AI agents, not IT support
Astropad’s Workbench lets users remotely monitor and control AI agents on Mac Minis from iPhone or iPad, with low-latency streaming and mobile access.
Fine-tuning Whisper for Pashto ASR: strategies and scale
arXiv:2604.06507v1 Announce Type: new Abstract: Pashto is absent from Whisper's pre-training corpus despite being one of CommonVoice's largest language collections, leaving off-the-shelf models unusable: all Whisper sizes output Arabic, Dari, or Urdu script on Pashto audio, achieving word error rates...
Scoring Edit Impact in Grammatical Error Correction via Embedded Association Graphs
arXiv:2604.06573v1 Announce Type: new Abstract: A Grammatical Error Correction (GEC) system produces a sequence of edits to correct an erroneous sentence. The quality of these edits is typically evaluated against human annotations. However, a sentence may admit multiple valid corrections,...
Spectral Edge Dynamics Reveal Functional Modes of Learning
arXiv:2604.06256v1 Announce Type: new Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation...
$S^3$: Stratified Scaling Search for Test-Time in Diffusion Language Models
arXiv:2604.06260v1 Announce Type: new Abstract: Test-time scaling investigates whether a fixed diffusion language model (DLM) can generate better outputs when given more inference compute, without additional training. However, naive best-of-$K$ sampling is fundamentally limited because it repeatedly draws from the...
MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple Myeloma
arXiv:2604.06267v1 Announce Type: new Abstract: Multimodal variational autoencoders (VAEs) have emerged as a powerful framework for survival risk modeling in multiple myeloma by integrating heterogeneous omics and clinical data. However, when trained under survival supervision, standard latent regularization strategies often...
A Supreme Court status report
In early January, as the country eagerly awaited a tariffs ruling that – as it turned out – was still more than a month away, Supreme Court watchers raised concerns […] The post A Supreme Court status report appeared first on SCOTUSblog.
LinkedIn scanning users' browser extensions sparks controversy and two lawsuits
LinkedIn says claims fabricated by extension maker suspended for scraping data.
Tankers passing through Strait of Hormuz will have to pay cryptocurrency toll
Any tanker passing must reveal its cargo so Iran can determine transit fee amount.
Poke makes using AI agents as easy as sending a text
Poke brings AI agents to everyday users via text message by handling tasks and automations without complex setup, apps, or technical know-how.
Supreme Court summarily closes the courthouse doors again
Civil Rights and Wrongs is a recurring series by Daniel Harawa covering criminal justice and civil rights cases before the court. I have written before about the Supreme Court’s troubling […] The post Supreme Court summarily closes the courthouse doors again appeared first...
Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
arXiv:2604.06515v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) allows scaling of language and vision models efficiently by activating only a small subset of experts per input. While this reduces computation, the large number of parameters still incurs substantial memory overhead...
Depression Detection at the Point of Care: Automated Analysis of Linguistic Signals from Routine Primary Care Encounters
arXiv:2604.06193v1 Announce Type: new Abstract: Depression is underdiagnosed in primary care, yet timely identification remains critical. Recorded clinical encounters, increasingly common with digital scribing technologies, present an opportunity to detect depression from naturalistic dialogue. We investigated automated depression detection from...
Databricks co-founder wins prestigious ACM award, says ‘AGI is here already’
Matei Zaharia has won the top honor from the Association for Computing Machinery. Now he's working on AI for research and says AGI is simply misunderstood.
Hallucination as output-boundary misclassification: a composite abstention architecture for language models
arXiv:2604.06195v1 Announce Type: new Abstract: Large language models often produce unsupported claims. We frame this as a misclassification error at the output boundary, where internally generated completions are emitted as if they were grounded in evidence. This motivates a composite...
Bi-level Heterogeneous Learning for Time Series Foundation Models: A Federated Learning Approach
arXiv:2604.06727v1 Announce Type: new Abstract: Heterogeneity in time series data is more pronounced than in vision or language, as temporal dynamics vary substantially across domains and tasks. Existing efforts on training time series foundation models (TSFMs) from scratch are often...
AWS boss explains why investing billions in both Anthropic and OpenAI is an OK conflict
AWS has an ingrained culture of handling competition, he explained, because the cloud giant also competes with its partners.
The Rhetoric of Machine Learning
arXiv:2604.06754v1 Announce Type: new Abstract: I examine the technology of machine learning from the perspective of rhetoric, which is simply the art of persuasion. Rather than being a neutral and "objective" way to build "world models" from data, machine learning...
Severity-Aware Weighted Loss for Arabic Medical Text Generation
arXiv:2604.06346v1 Announce Type: new Abstract: Large language models have shown strong potential for Arabic medical text generation; however, traditional fine-tuning objectives treat all medical cases uniformly, ignoring differences in clinical severity. This limitation is particularly critical in healthcare settings, where...
In-Context Learning in Speech Language Models: Analyzing the Role of Acoustic Features, Linguistic Structure, and Induction Heads
arXiv:2604.06356v1 Announce Type: new Abstract: In-Context Learning (ICL) has been extensively studied in text-only Language Models, but remains largely unexplored in the speech domain. Here, we investigate how linguistic and acoustic features affect ICL in Speech Language Models. We focus...
Drifting Fields are not Conservative
arXiv:2604.06333v1 Announce Type: new Abstract: Drifting models generate high-quality samples in a single forward pass by transporting generated samples toward the data distribution using a vector valued drift field. We investigate whether this procedure is equivalent to optimizing a scalar...
A Severity-Based Curriculum Learning Strategy for Arabic Medical Text Generation
arXiv:2604.06365v1 Announce Type: new Abstract: Arabic medical text generation is increasingly needed to help users interpret symptoms and access general health guidance in their native language. Nevertheless, many existing methods assume uniform importance across training samples, overlooking differences in clinical...