CUDABench: Benchmarking LLMs for Text-to-CUDA Generation
arXiv:2603.02236v1 Announce Type: new Abstract: Recent studies have demonstrated the potential of Large Language Models (LLMs) in generating GPU Kernels. Current benchmarks focus on the translation of high-level languages into CUDA, overlooking the more general and challenging task of text-to-CUDA...
Concept Heterogeneity-aware Representation Steering
arXiv:2603.02237v1 Announce Type: new Abstract: Representation steering offers a lightweight mechanism for controlling the behavior of large language models (LLMs) by intervening on internal activations at inference time. Most existing methods rely on a single global steering direction, typically obtained...
High-order Knowledge Based Network Controllability Robustness Prediction: A Hypergraph Neural Network Approach
arXiv:2603.02265v1 Announce Type: new Abstract: In order to evaluate the invulnerability of networks against various types of attacks and provide guidance for potential performance enhancement as well as controllability maintenance, network controllability robustness (NCR) has attracted increasing attention in recent...
Graph Attention Based Prioritization of Disease Responsible Genes from Multimodal Alzheimer's Network
arXiv:2603.02273v1 Announce Type: new Abstract: Prioritizing disease-associated genes is central to understanding the molecular mechanisms of complex disorders such as Alzheimer's disease (AD). Traditional network-based approaches rely on static centrality measures and often fail to capture cross-modal biological heterogeneity. We...
A Comparative Study of UMAP and Other Dimensionality Reduction Methods
arXiv:2603.02275v1 Announce Type: new Abstract: Uniform Manifold Approximation and Projection (UMAP) is a widely used manifold learning technique for dimensionality reduction. This paper studies UMAP, supervised UMAP, and several competing dimensionality reduction methods, including Principal Component Analysis (PCA), Kernel PCA,...
Quantum-Inspired Fine-Tuning for Few-Shot AIGC Detection via Phase-Structured Reparameterization
arXiv:2603.02281v1 Announce Type: new Abstract: Recent studies show that quantum neural networks (QNNs) generalize well in few-shot regimes. To extend this advantage to large-scale tasks, we propose Q-LoRA, a quantum-enhanced fine-tuning scheme that integrates lightweight QNNs into the low-rank adaptation...
The Malignant Tail: Spectral Segregation of Label Noise in Over-Parameterized Networks
arXiv:2603.02293v1 Announce Type: new Abstract: While implicit regularization facilitates benign overfitting in low-noise regimes, recent theoretical work predicts a sharp phase transition to harmful overfitting as the noise-to-signal ratio increases. We experimentally isolate the geometric mechanism of this transition: the...
Personalized Multi-Agent Average Reward TD-Learning via Joint Linear Approximation
arXiv:2603.02426v1 Announce Type: new Abstract: We study personalized multi-agent average reward TD learning, in which a collection of agents interacts with different environments and jointly learns their respective value functions. We focus on the setting where there exists a shared...
Spectral Regularization for Diffusion Models
arXiv:2603.02447v1 Announce Type: new Abstract: Diffusion models are typically trained using pointwise reconstruction objectives that are agnostic to the spectral and multi-scale structure of natural signals. We propose a loss-level spectral regularization framework that augments standard diffusion training with differentiable...
ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution
arXiv:2603.02510v1 Announce Type: new Abstract: The transition from sequential to parallel computing is essential for modern high-performance applications but is hindered by the steep learning curve of concurrent programming. This challenge is magnified for irregular data structures (such as sparse...
Bridging Diffusion Guidance and Anderson Acceleration via Hopfield Dynamics
arXiv:2603.02531v1 Announce Type: new Abstract: Classifier-Free Guidance (CFG) has significantly enhanced the generative quality of diffusion models by extrapolating between conditional and unconditional outputs. However, its high inference cost and limited applicability to distilled or single-step models have shifted research...
EdgeFLow: Serverless Federated Learning via Sequential Model Migration in Edge Networks
arXiv:2603.02562v1 Announce Type: new Abstract: Federated Learning (FL) has emerged as a transformative distributed learning paradigm in the era of Internet of Things (IoT), reconceptualizing data processing methodologies. However, FL systems face significant communication bottlenecks due to inevitable client-server data...
Birthright citizenship: an empirical analysis of supposedly originalist briefs
Brothers in Law is a recurring series by brothers Akhil and Vikram Amar, with special emphasis on measuring what the Supreme Court says against what the Constitution itself says. For more content from […]
Jensen Huang says Nvidia is pulling back from OpenAI and Anthropic, but his explanation raises more questions than it answers
Nvidia CEO Jensen Huang said Wednesday that his company's investments in OpenAI and Anthropic will likely be its last — but his explanation may not tell the whole story.
Decagon completes first tender offer at $4.5B valuation
The AI-powered customer support startup is the latest example of a fast-growing, young company that's providing employee liquidity.
Distribution-Aware Companding Quantization of Large Language Models
arXiv:2603.00364v1 Announce Type: new Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at once results in higher sample...
A Typologically Grounded Evaluation Framework for Word Order and Morphology Sensitivity in Multilingual Masked LMs
arXiv:2603.00432v1 Announce Type: new Abstract: We introduce a typology-aware diagnostic for multilingual masked language models that tests reliance on word order versus inflectional form. Using Universal Dependencies, we apply inference-time perturbations: full token scrambling, content-word scrambling with function words fixed,...
CIRCUS: Circuit Consensus under Uncertainty via Stability Ensembles
arXiv:2603.00523v1 Announce Type: new Abstract: Mechanistic circuit discovery is notoriously sensitive to arbitrary analyst choices, especially pruning thresholds and feature dictionaries, often yielding brittle "one-shot" explanations with no principled notion of uncertainty. We reframe circuit discovery as an uncertainty-quantification problem...
Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research
arXiv:2603.00582v1 Announce Type: new Abstract: While Large Language Models (LLMs) have demonstrated proficiency in Deep Research or Wide Search, their capacity to solve highly complex questions (those requiring long-horizon planning, massive evidence gathering, and synthesis across heterogeneous sources) remains largely unexplored. We...
QQ: A Toolkit for Language Identifiers and Metadata
arXiv:2603.00620v1 Announce Type: new Abstract: The growing number of languages considered in multilingual NLP, including new datasets and tasks, poses challenges regarding properly and accurately reporting which languages are used and how. For example, datasets often use different language identifiers;...
BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages
arXiv:2603.00634v1 Announce Type: new Abstract: Multilingual falsehoods threaten information integrity worldwide, yet detection benchmarks remain confined to English or a few high-resource languages, leaving low-resource linguistic communities without robust defense tools. We introduce BLUFF, a comprehensive benchmark for detecting false...
RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis
arXiv:2603.00686v1 Announce Type: new Abstract: Large Language Models have evolved from single-round generators into long-horizon agents, capable of complex text synthesis scenarios. However, current evaluation frameworks lack the ability to assess the actual synthesis operations, such as outlining, drafting, and...
DRIV-EX: Counterfactual Explanations for Driving LLMs
arXiv:2603.00696v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as reasoning engines in autonomous driving, yet their decision-making remains opaque. We propose to study their decision process through counterfactual explanations, which identify the minimal semantic changes to...
LaSTR: Language-Driven Time-Series Segment Retrieval
arXiv:2603.00725v1 Announce Type: new Abstract: Effectively searching time-series data is essential for system analysis, but existing methods often require expert-designed similarity criteria or rely on global, series-level descriptions. We study language-driven segment retrieval: given a natural language query, the goal...
Constitutional Black-Box Monitoring for Scheming in LLM Agents
arXiv:2603.00829v1 Announce Type: new Abstract: Safe deployment of Large Language Model (LLM) agents in autonomous settings requires reliable oversight mechanisms. A central challenge is detecting scheming, where agents covertly pursue misaligned goals. One approach to mitigating such risks is LLM-based...
KVSlimmer: Theoretical Insights and Practical Optimizations for Asymmetric KV Merging
arXiv:2603.00907v1 Announce Type: new Abstract: The growing computational and memory demands of the Key-Value (KV) cache significantly limit the ability of Large Language Models (LLMs). While KV merging has emerged as a promising solution, existing methods that rely on empirical...
Towards Orthographically-Informed Evaluation of Speech Recognition Systems for Indian Languages
arXiv:2603.00941v1 Announce Type: new Abstract: Evaluating ASR systems for Indian languages is challenging due to spelling variations, suffix splitting flexibility, and non-standard spellings in code-mixed words. Traditional Word Error Rate (WER) often presents a bleaker picture of system performance than...
How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning
arXiv:2603.01070v1 Announce Type: new Abstract: Solving complex geometric problems inherently requires interleaved reasoning: a tight alternation between constructing diagrams and performing logical deductions. Although recent Multimodal Large Language Models (MLLMs) have demonstrated strong capabilities in visual generation and plotting, we...