Google Cloud’s VP for startups on reading your ‘check engine light’ before it’s too late
Startup founders are being pushed to move faster than ever, using AI while facing tighter funding, rising infrastructure costs, and more pressure to show real traction early. Cloud credits, access to GPUs, and foundation models have made it easier to...
Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric
arXiv:2602.14069v1 Announce Type: new Abstract: Scalar reward models compress multi-dimensional human preferences into a single opaque score, creating an information bottleneck that often leads to brittleness and reward hacking in open-ended alignment. We argue that robust alignment for non-verifiable tasks...
Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality
arXiv:2602.14080v1 Announce Type: new Abstract: Standard factuality evaluations of LLMs treat all errors alike, obscuring whether failures arise from missing knowledge (empty shelves) or from limited access to encoded facts (lost keys). We propose a behavioral framework that profiles factual...
Knowing When Not to Answer: Abstention-Aware Scientific Reasoning
arXiv:2602.14189v1 Announce Type: new Abstract: Large language models are increasingly used to answer and verify scientific claims, yet existing evaluations typically assume that a model must always produce a definitive answer. In scientific settings, however, unsupported or uncertain conclusions can...
BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents
arXiv:2602.13345v1 Announce Type: new Abstract: Decades of engineering drawings and technical records remain locked in legacy archives with inconsistent or missing metadata, making retrieval difficult and often manual. We present Blueprint, a layout-aware multimodal retrieval system designed for large-scale engineering...
Finding Highly Interpretable Prompt-Specific Circuits in Language Models
arXiv:2602.13483v1 Announce Type: new Abstract: Understanding the internal circuits that language models use to solve tasks remains a central challenge in mechanistic interpretability. Most prior work identifies circuits at the task level by averaging across many prompts, implicitly assuming a...
Singular Vectors of Attention Heads Align with Features
arXiv:2602.13524v1 Announce Type: new Abstract: Identifying feature representations in language models is a central task in mechanistic interpretability. Several recent studies have made an implicit assumption that feature representations can be inferred in some cases from singular vectors of attention...
QuaRK: A Quantum Reservoir Kernel for Time Series Learning
arXiv:2602.13531v1 Announce Type: new Abstract: Quantum reservoir computing offers a promising route for time series learning by modelling sequential data via rich quantum dynamics while the only training required happens at the level of a lightweight classical readout. However, studies...
Interpretable clustering via optimal multiway-split decision trees
arXiv:2602.13586v1 Announce Type: new Abstract: Clustering serves as a vital tool for uncovering latent data structures, and achieving both high accuracy and interpretability is essential. To this end, existing methods typically construct binary decision trees by solving mixed-integer nonlinear optimization...
Cumulative Utility Parity for Fair Federated Learning under Intermittent Client Participation
arXiv:2602.13651v1 Announce Type: new Abstract: In real-world federated learning (FL) systems, client participation is intermittent, heterogeneous, and often correlated with data characteristics or resource constraints. Existing fairness approaches in FL primarily focus on equalizing loss or accuracy conditional on participation,...
Zero-Order Optimization for LLM Fine-Tuning via Learnable Direction Sampling
arXiv:2602.13659v1 Announce Type: new Abstract: Fine-tuning large pretrained language models (LLMs) is a cornerstone of modern NLP, yet its growing memory demands (driven by backpropagation and large optimizer States) limit deployment in resource-constrained settings. Zero-order (ZO) methods bypass backpropagation by...
Attention Head Entropy of LLMs Predicts Answer Correctness
arXiv:2602.13699v1 Announce Type: new Abstract: Large language models (LLMs) often generate plausible yet incorrect answers, posing risks in safety-critical settings such as medicine. Human evaluation is expensive, and LLM-as-judge approaches risk introducing hidden errors. Recent white-box methods detect contextual hallucinations...
On Representation Redundancy in Large-Scale Instruction Tuning Data Selection
arXiv:2602.13773v1 Announce Type: new Abstract: Data quality is a crucial factor in large language models training. While prior work has shown that models trained on smaller, high-quality datasets can outperform those trained on much larger but noisy or low-quality corpora,...
MEMTS: Internalizing Domain Knowledge via Parameterized Memory for Retrieval-Free Domain Adaptation of Time Series Foundation Models
arXiv:2602.13783v1 Announce Type: new Abstract: While Time Series Foundation Models (TSFMs) have demonstrated exceptional performance in generalized forecasting, their performance often degrades significantly when deployed in real-world vertical domains characterized by temporal distribution shifts and domain-specific periodic structures. Current solutions...
Cast-R1: Learning Tool-Augmented Sequential Decision Policies for Time Series Forecasting
arXiv:2602.13802v1 Announce Type: new Abstract: Time series forecasting has long been dominated by model-centric approaches that formulate prediction as a single-pass mapping from historical observations to future values. Despite recent progress, such formulations often struggle in complex and evolving settings,...
A Multi-Agent Framework for Code-Guided, Modular, and Verifiable Automated Machine Learning
arXiv:2602.13937v1 Announce Type: new Abstract: Automated Machine Learning (AutoML) has revolutionized the development of data-driven solutions; however, traditional frameworks often function as "black boxes", lacking the flexibility and transparency required for complex, real-world engineering tasks. Recent Large Language Model (LLM)-based...