Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences
arXiv:2602.21585v1 Announce Type: new Abstract: Many applications seek to optimize LLM outputs at test time by iteratively proposing, scoring, and refining candidates over a discrete output space. Existing methods use a calibrated scalar evaluator for the target objective to guide...
ABM-UDE: Developing Surrogates for Epidemic Agent-Based Models via Scientific Machine Learning
arXiv:2602.21588v1 Announce Type: new Abstract: Agent-based epidemic models (ABMs) encode behavioral and policy heterogeneity but are too slow for nightly hospital planning. We develop county-ready surrogates that learn directly from exascale ABM trajectories using Universal Differential Equations (UDEs): mechanistic SEIR-family...
The Beginnings Of The One Big Beautiful Bill Act: Placing The 2017 Tax Cuts And Jobs Act In Historical Perspective
On July 4, 2025, President Donald J. Trump signed into law the One Big Beautiful Bill Act (OBBBA). This new law was built on the foundations of its immediate predecessor, the 2017 Tax Cuts and Jobs Act (TCJA). This Essay examines...
Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches
arXiv:2602.20634v1 Announce Type: new Abstract: The proliferation of hate speech on social media platforms has necessitated the development of effective detection and moderation tools. This study evaluates the efficacy of various machine learning models in identifying hate speech and offensive...
Semantic Novelty at Scale: Narrative Shape Taxonomy and Readership Prediction in 28,606 Books
arXiv:2602.20647v1 Announce Type: new Abstract: I introduce semantic novelty--cosine distance between each paragraph's sentence embedding and the running centroid of all preceding paragraphs--as an information-theoretic measure of narrative structure at corpus scale. Applying it to 28,606 books in PG19 (pre-1920...
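The definition in this abstract (cosine distance between each paragraph's embedding and the running centroid of all preceding paragraphs) can be illustrated with a minimal sketch. This is not the author's implementation: the function names `cosine_distance` and `semantic_novelty` are hypothetical, embeddings are assumed to arrive as plain lists of floats, the first paragraph is assumed to get no score because it has no predecessors, and a zero-length vector is assumed to count as maximally distant.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def semantic_novelty(embeddings):
    """Score each paragraph (from the second onward) against the
    centroid of all paragraphs that precede it.

    Assumption: the first paragraph has no predecessors, so the result
    has len(embeddings) - 1 entries (or none for empty input).
    """
    if not embeddings:
        return []
    scores = []
    running_sum = list(embeddings[0])  # sum of paragraphs 0..i-1
    for i in range(1, len(embeddings)):
        centroid = [s / i for s in running_sum]
        scores.append(cosine_distance(embeddings[i], centroid))
        # Fold the current paragraph into the running sum for the next step.
        running_sum = [s + x for s, x in zip(running_sum, embeddings[i])]
    return scores


if __name__ == "__main__":
    # Toy 2-D "embeddings": the third paragraph points in a new direction.
    toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(semantic_novelty(toy))
```

Keeping a running sum rather than recomputing the centroid makes each step linear in the embedding dimension, which matters when the same pass is applied to tens of thousands of books.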
ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition
arXiv:2602.20727v1 Announce Type: new Abstract: LoRA has become a universal Parameter-Efficient Fine-Tuning (PEFT) technique that equips Large Language Models (LLMs) to adapt quickly to new tasks. However, when these models are scaled up, even the latest LoRA variants still introduce...
Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization
arXiv:2602.20743v1 Announce Type: new Abstract: Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing anonymization methods rely on static, manually...
Explicit Grammar Semantic Feature Fusion for Robust Text Classification
arXiv:2602.20749v1 Announce Type: new Abstract: Natural Language Processing enables computers to understand human language by analysing and classifying text efficiently with deep-level grammatical and semantic features. Existing models capture features by learning from large corpora with transformer models, which are...
MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Elastic LLMs
arXiv:2602.20191v1 Announce Type: cross Abstract: Changing runtime complexity on cloud and edge devices necessitates elastic large language model (LLM) deployment, where an LLM can run inference at various quantization precisions based on available computational resources. However, it has been observed...
Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference
arXiv:2602.20449v1 Announce Type: cross Abstract: Modern Protein Language Models (PLMs) apply transformer-based model architectures from natural language processing to biological sequences, predicting a variety of protein functions and properties. However, protein language has key differences from natural language, such as...
HiSAC: Hierarchical Sparse Activation Compression for Ultra-long Sequence Modeling in Recommenders
arXiv:2602.21009v1 Announce Type: cross Abstract: Modern recommender systems leverage ultra-long user behavior sequences to capture dynamic preferences, but end-to-end modeling is infeasible in production due to latency and memory constraints. While summarizing history via interest centers offers a practical alternative,...
Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning
arXiv:2602.20197v1 Announce Type: new Abstract: Reinforcement Learning with verifiable rewards (RLVR) has emerged as a primary learning paradigm for enhancing the reasoning capabilities of multi-modal large language models (MLLMs). However, during RL training, the enormous state space of MLLM and...
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
arXiv:2602.20309v1 Announce Type: new Abstract: Vision-language-action (VLA) models unify perception, language, and control for embodied agents but face significant challenges in practical deployment due to rapidly increasing compute and memory demands, especially as models scale to longer horizons and larger...
cc-Shapley: Measuring Multivariate Feature Importance Needs Causal Context
arXiv:2602.20396v1 Announce Type: new Abstract: Explainable artificial intelligence promises to yield insights into relevant features, thereby enabling humans to examine and scrutinize machine learning models or even facilitating scientific discovery. Considering the widespread technique of Shapley values, we find that...
Wasserstein Distributionally Robust Online Learning
arXiv:2602.20403v1 Announce Type: new Abstract: We study distributionally robust online learning, where a risk-averse learner updates decisions sequentially to guard against worst-case distributions drawn from a Wasserstein ambiguity set centered at past observations. While this paradigm is well understood in...
$\kappa$-Explorer: A Unified Framework for Active Model Estimation in MDPs
arXiv:2602.20404v1 Announce Type: new Abstract: In tabular Markov decision processes (MDPs) with perfect state observability, each trajectory provides active samples from the transition distributions conditioned on state-action pairs. Consequently, accurate model estimation depends on how the exploration policy allocates visitation...
A Long-Short Flow-Map Perspective for Drifting Models
arXiv:2602.20463v1 Announce Type: new Abstract: This paper provides a reinterpretation of the Drifting Model [deng2026generative] through a semigroup-consistent long-short flow-map factorization. We show that a global transport process can be decomposed into a long-horizon flow map followed by a short-time terminal...
Elimination-compensation pruning for fully-connected neural networks
arXiv:2602.20467v1 Announce Type: new Abstract: The unmatched ability of Deep Neural Networks to capture complex patterns in large and noisy datasets is often associated with their large hypothesis space, and consequently with the vast number of parameters that characterize model...
CGSTA: Cross-Scale Graph Contrast with Stability-Aware Alignment for Multivariate Time-Series Anomaly Detection
arXiv:2602.20468v1 Announce Type: new Abstract: Multivariate time-series anomaly detection is essential for reliable industrial control, telemetry, and service monitoring. However, the evolving inter-variable dependencies and inevitable noise render it challenging. Existing methods often use single-scale graphs or instance-level contrast. Moreover,...
Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer
arXiv:2602.19058v1 Announce Type: new Abstract: Large vision-language models (LVLMs) have rapidly advanced across various domains, yet they still lag behind strong text-only large language models (LLMs) on tasks that require multi-step inference and compositional decision-making. Motivated by their shared transformer...
TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes
arXiv:2602.19079v1 Announce Type: new Abstract: Topic modeling extracts latent themes from large text collections, but leading approaches like BERTopic face critical limitations: stochastic instability, loss of lexical precision ("Embedding Blur"), and reliance on a single data perspective. We present TriTopic,...
Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models
arXiv:2602.19101v1 Announce Type: new Abstract: Value alignment of Large Language Models (LLMs) requires us to empirically measure these models' actual, acquired representation of value. Among the characteristics of value representation in humans is that they distinguish among value of different...
Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models
arXiv:2602.19111v1 Announce Type: new Abstract: Parameter-Efficient Fine-Tuning (PEFT) methods, especially LoRA, are widely used for adapting pre-trained models to downstream tasks due to their computational and storage efficiency. However, in the context of LoRA and its variants, the potential of...
Facet-Level Persona Control by Trait-Activated Routing with Contrastive SAE for Role-Playing LLMs
arXiv:2602.19157v1 Announce Type: new Abstract: Personality control in Role-Playing Agents (RPAs) is commonly achieved via training-free methods that inject persona descriptions and memory through prompts or retrieval-augmented generation, or via supervised fine-tuning (SFT) on persona-specific corpora. While SFT can be...
Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations
arXiv:2602.19320v1 Announce Type: new Abstract: Agentic memory systems enable large language model (LLM) agents to maintain state across long interactions, supporting long-horizon reasoning and personalization beyond fixed context windows. Despite rapid architectural development, the empirical foundations of these systems remain...
Weak-Form Evolutionary Kolmogorov-Arnold Networks for Solving Partial Differential Equations
arXiv:2602.18515v1 Announce Type: new Abstract: Partial differential equations (PDEs) form a central component of scientific computing. Among recent advances in deep learning, evolutionary neural networks have been developed to successively capture the temporal dynamics of time-dependent PDEs via parameter evolution....
The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure
arXiv:2602.18523v1 Announce Type: new Abstract: Grokking -- the abrupt transition from memorization to generalization long after near-zero training loss -- has been studied mainly in single-task settings. We extend geometric analysis to multi-task modular arithmetic, training shared-trunk Transformers on dual-task...
GIST: Targeted Data Selection for Instruction Tuning via Coupled Optimization Geometry
arXiv:2602.18584v1 Announce Type: new Abstract: Targeted data selection has emerged as a crucial paradigm for efficient instruction tuning, aiming to identify a small yet influential subset of training examples for a specific target task. In practice, influence is often measured...
Learning Invariant Visual Representations for Planning with Joint-Embedding Predictive World Models
arXiv:2602.18639v1 Announce Type: new Abstract: World models learned from high-dimensional visual observations allow agents to make decisions and plan directly in latent space, avoiding pixel-level reconstruction. However, recent latent predictive architectures (JEPAs), including the DINO world model (DINO-WM), display a...