Tula: Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training
arXiv:2603.18112v1 Announce Type: new Abstract: Distributed training increases the number of batches processed per iteration either by scaling out (adding more nodes) or scaling up (increasing the batch size). However, the largest configuration does not necessarily yield the best performance. Horizontal scaling introduces...
AGRI-Fidelity: Evaluating the Reliability of Listenable Explanations for Poultry Disease Detection
arXiv:2603.18247v1 Announce Type: new Abstract: Existing XAI metrics measure faithfulness for a single model, ignoring model multiplicity where near-optimal classifiers rely on different or spurious acoustic cues. In noisy farm environments, stationary artifacts such as ventilation noise can produce explanations...
On Additive Gaussian Processes for Wind Farm Power Prediction
arXiv:2603.18281v1 Announce Type: new Abstract: Population-based Structural Health Monitoring (PBSHM) aims to share information between similar machines or structures. This paper takes a population-level perspective, exploring the use of additive Gaussian processes to reveal variations in turbine-specific and farm-level power...
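The core idea behind additive Gaussian processes is standard: build the kernel as a sum of one-dimensional kernels, one per input, so each additive component can be inspected separately. A minimal NumPy sketch of GP regression with such a kernel on a toy additive function (this is a generic illustration, not the paper's wind-farm model; the data and length scales are made up):

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """1-D squared-exponential kernel between vectors a (n,) and b (m,)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(100, 2))
# Additive ground truth: y = f1(x1) + f2(x2) + noise.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(100)
Xtr, ytr, Xte, yte = X[:80], y[:80], X[80:], y[80:]

# Additive kernel: one 1-D RBF per input dimension, summed.
def k_add(A, B):
    return rbf(A[:, 0], B[:, 0]) + rbf(A[:, 1], B[:, 1])

# Exact GP posterior mean with observation noise 0.05.
K = k_add(Xtr, Xtr) + 0.05 ** 2 * np.eye(len(Xtr))
alpha = np.linalg.solve(K, ytr)
pred = k_add(Xte, Xtr) @ alpha
rmse = np.sqrt(np.mean((pred - yte) ** 2))
```

Because the kernel is a sum, the posterior decomposes into per-input components, which is what makes turbine-specific versus farm-level effects separable in principle.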
Escaping Offline Pessimism: Vector-Field Reward Shaping for Safe Frontier Exploration
arXiv:2603.18326v1 Announce Type: new Abstract: While offline reinforcement learning provides reliable policies for real-world deployment, its inherent pessimism severely restricts an agent's ability to explore and collect novel data online. Drawing inspiration from safe reinforcement learning, exploring near the boundary...
Epistemic Generative Adversarial Networks
arXiv:2603.18348v1 Announce Type: new Abstract: Generative models, particularly Generative Adversarial Networks (GANs), often suffer from a lack of output diversity, frequently generating similar samples rather than a wide range of variations. This paper introduces a novel generalization of the GAN...
Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits
arXiv:2603.18431v1 Announce Type: new Abstract: Quantum multi-armed bandits (MAB) and stochastic linear bandits (SLB) have recently attracted significant attention, as their quantum counterparts can achieve quadratic speedups over classical MAB and SLB. However, most existing quantum MAB algorithms assume ideal...
FBI started buying Americans' location data again, Kash Patel confirms
Tom Cotton supports FBI data purchasing, compares it to searching people's trash.
Afroman keeps trolling cops after winning “Lemon Pound Cake” defamation case
Cops asked the jury for millions after Afroman used raid footage in music videos.
Meta rolls out new AI content enforcement systems while reducing reliance on third-party vendors
Meta believes these AI systems can detect more violations with greater accuracy, better prevent scams, respond more quickly to real-world events, and reduce over-enforcement.
DoorDash launches a new ‘Tasks’ app that pays couriers to submit videos to train AI
Delivery couriers will be able to earn money by completing activities like filming everyday tasks or recording themselves speaking in another language.
TechCrunch Startup Battlefield 200 nominations are still open
Nominate your startup, or one you know, for TechCrunch Startup Battlefield 200 before May 27. Chance to win $100,000 equity-free funding and VC access.
Multiverse Computing pushes its compressed AI models into the mainstream
After compressing models from major AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, Multiverse Computing has launched both an app that showcases the capabilities of its compressed models and an API that makes them more widely available.
A foundation model for electrodermal activity data
arXiv:2603.16878v1 Announce Type: new Abstract: Foundation models have recently extended beyond natural language and vision to time-series domains, including physiological signals. However, progress in electrodermal activity (EDA) modeling is hindered by the absence of large-scale, curated, and openly accessible datasets....

MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning
arXiv:2603.16929v1 Announce Type: new Abstract: Regulating the importance ratio is critical for the training stability of Group Relative Policy Optimization (GRPO) based frameworks. However, prevailing ratio control methods, such as hard clipping, suffer from non-differentiable boundaries and vanishing gradient regions,...
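The vanishing-gradient issue the abstract attributes to hard clipping is easy to see numerically: outside the clip interval, the PPO/GRPO-style surrogate is constant in the ratio, so its gradient is exactly zero, while any smooth saturating alternative keeps a nonzero gradient. The sketch below is a generic illustration of that contrast (the tanh soft clip is an assumption for demonstration, not MHPO's actual modulation):

```python
import numpy as np

def hard_clip_obj(r, adv, eps=0.2):
    """PPO/GRPO-style hard-clipped surrogate (positive-advantage branch)."""
    return np.minimum(r * adv, np.clip(r, 1 - eps, 1 + eps) * adv)

def soft_clip_obj(r, adv, eps=0.2):
    """Smooth stand-in: tanh-saturated ratio, differentiable everywhere."""
    return (1 + eps * np.tanh((r - 1) / eps)) * adv

def grad(f, r, h=1e-5):
    """Central finite difference in the ratio r, at adv = 1."""
    return (f(r + h, 1.0) - f(r - h, 1.0)) / (2 * h)

# At r = 1.5 (outside [0.8, 1.2]) the hard clip's gradient vanishes;
# the smooth objective still provides a learning signal.
g_hard = grad(hard_clip_obj, 1.5)
g_soft = grad(soft_clip_obj, 1.5)
```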
Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
arXiv:2603.17044v1 Announce Type: new Abstract: Unified multimodal models share a language model backbone for both understanding and generating images. Can DPO align both capabilities simultaneously? We present the first systematic study of this question, applying DPO to Janus-Pro at 1B...
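For readers unfamiliar with DPO itself: the standard loss compares how much the policy prefers the chosen response over the rejected one, relative to a frozen reference model. A minimal NumPy sketch of the per-pair loss (generic DPO, with made-up log-probabilities; not the paper's Janus-Pro setup):

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_w, ref_l, beta=0.1):
    """Standard DPO loss from per-response log-probs.
    logp_*: policy log-probs of the chosen (w) and rejected (l) responses;
    ref_*: the same under the frozen reference model."""
    margin = beta * ((logp_w - ref_w) - (logp_l - ref_l))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# No preference over the reference: loss is -log(0.5).
loose = dpo_loss(logp_w=-10.0, logp_l=-10.0, ref_w=-10.0, ref_l=-10.0)
# Policy moved toward the chosen response: loss drops.
tight = dpo_loss(logp_w=-8.0, logp_l=-12.0, ref_w=-10.0, ref_l=-10.0)
```

In a unified model, both understanding and generation heads share the backbone that this gradient updates, which is exactly why the two capabilities can interfere.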
PRISM: Demystifying Retention and Interaction in Mid-Training
arXiv:2603.17074v1 Announce Type: new Abstract: We present PRISM, a comprehensive empirical study of mid-training design choices for large language models. Through controlled experiments across seven base models spanning four families (Granite, LLaMA, Mistral, Nemotron-H), two architecture types (dense Transformer and...
CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning
arXiv:2603.17075v1 Announce Type: new Abstract: Motivated by auto-proof generation and Valiant's VP vs. VNP conjecture, we study the problem of discovering efficient arithmetic circuits to compute polynomials, using addition and multiplication gates. We formulate this problem as a single-player game,...
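The objects being searched for here are straight-line arithmetic circuits: a sequence of addition and multiplication gates, each reading two earlier wires. A minimal evaluator makes the representation concrete (the gate encoding below is an assumed illustration, not the paper's game formulation):

```python
def eval_circuit(gates, inputs):
    """Evaluate a straight-line arithmetic circuit.
    gates: list of (op, i, j) with op in {'+', '*'}; i, j index earlier wires.
    Wires 0..len(inputs)-1 hold the inputs; each gate appends one wire.
    Returns the value on the final wire."""
    wires = list(inputs)
    for op, i, j in gates:
        wires.append(wires[i] + wires[j] if op == '+' else wires[i] * wires[j])
    return wires[-1]

# x^2 + y in two gates: wire2 = x * x, wire3 = wire2 + y.
gates = [('*', 0, 0), ('+', 2, 1)]
val = eval_circuit(gates, [3, 4])  # 3^2 + 4
```

Circuit size is just the gate count, so the RL objective of "compute the target polynomial with few gates" maps directly onto sequences of such moves.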
Noise-Response Calibration: A Causal Intervention Protocol for LLM-Judges
arXiv:2603.17172v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as automated judges and synthetic labelers, especially in low-label settings. Yet these systems are stochastic and often overconfident, which makes deployment decisions difficult when external ground truth is...
Domain-informed explainable boosting machines for trustworthy lateral spread predictions
arXiv:2603.17175v1 Announce Type: new Abstract: Explainable Boosting Machines (EBMs) provide transparent predictions through additive shape functions, enabling direct inspection of feature contributions. However, EBMs can learn non-physical relationships that reduce their reliability in natural hazard applications. This study presents a...
Self-Conditioned Denoising for Atomistic Representation Learning
arXiv:2603.17196v1 Announce Type: new Abstract: The success of large-scale pretraining in NLP and computer vision has catalyzed growing efforts to develop analogous foundation models for the physical sciences. However, pretraining strategies using atomistic data remain underexplored. To date, large-scale supervised...
Causal Representation Learning on High-Dimensional Data: Benchmarks, Reproducibility, and Evaluation Metrics
arXiv:2603.17405v1 Announce Type: new Abstract: Causal representation learning (CRL) models aim to transform high-dimensional data into a latent space, enabling interventions to generate counterfactual samples or modify existing data based on the causal relationships among latent variables. To facilitate the...
The Phasor Transformer: Resolving Attention Bottlenecks on the Unit Circle
arXiv:2603.17433v1 Announce Type: new Abstract: Transformer models have redefined sequence learning, yet dot-product self-attention introduces a quadratic token-mixing bottleneck for long-context time-series. We introduce the Phasor Transformer block, a phase-native alternative representing sequence states on the unit-circle manifold S^1. Each...
Baguan-TS: A Sequence-Native In-Context Learning Model for Time Series Forecasting with Covariates
arXiv:2603.17439v1 Announce Type: new Abstract: Transformers enable in-context learning (ICL) for rapid, gradient-free adaptation in time series forecasting, yet most ICL-style approaches rely on tabularized, hand-crafted features, while end-to-end sequence models lack inference-time adaptation. We bridge this gap with a...
QuantFL: Sustainable Federated Learning for Edge IoT via Pre-Trained Model Quantisation
arXiv:2603.17507v1 Announce Type: new Abstract: Federated Learning (FL) enables privacy-preserving intelligence on Internet of Things (IoT) devices but incurs a significant carbon footprint due to the high energy cost of frequent uplink transmission. While pre-trained models are increasingly available on...
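The uplink-cost argument rests on a standard observation: quantising a model update to int8 before transmission cuts payload size fourfold versus float32, at a bounded reconstruction error. A NumPy sketch of symmetric per-tensor quantisation (a generic illustration of the technique, not QuantFL's specific scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a client's model update (weight delta) to be uplinked.
delta = rng.standard_normal(10_000).astype(np.float32) * 0.01

# Symmetric int8 quantisation: one float32 scale + an int8 payload.
scale = np.abs(delta).max() / 127.0
q = np.clip(np.round(delta / scale), -127, 127).astype(np.int8)
deq = q.astype(np.float32) * scale  # server-side dequantisation

compression = delta.nbytes / q.nbytes  # 4x fewer uplink bytes
max_err = np.abs(delta - deq).max()    # bounded by scale / 2
```

The error bound scale/2 follows from rounding to the nearest quantisation level, which is why per-tensor (or per-layer) scales matter when weight magnitudes vary.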
Meta is having trouble with rogue AI agents
A rogue AI agent inadvertently exposed Meta company and user data to engineers who didn't have permission to see it.
Sam Altman’s thank-you to coders draws the memes
Altman expresses gratitude for people who know how to write code from scratch. The internet replies with salty jokes.
Nothing CEO Carl Pei says smartphone apps will disappear as AI agents take their place
Nothing CEO Carl Pei says AI agents will eventually replace apps, shifting smartphones toward systems that understand intent and act on a user's behalf.
Patreon CEO calls AI companies’ fair use argument ‘bogus,’ says creators should be paid
Patreon CEO Jack Conte says AI companies should pay creators for training data, arguing their fair use defense falls apart when they license content from major publishers.
Rebel Audio is a new AI podcasting tool aimed at first-time creators
Rebel Audio is a new all-in-one podcasting tool that allows creators to record podcasts, edit, clip content for social, and publish episodes, all without ever leaving the platform.
The Gemini-powered features in Google Workspace that are worth using
From summarizing emails and drafting content to organizing data and tracking meetings, here are the best Gemini features in Google Workspace.