A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives
arXiv:2602.21351v1 Announce Type: new Abstract: The rapid accumulation of Earth science data has created a significant scalability challenge; while repositories like PANGAEA host vast collections of datasets, citation metrics indicate that a substantial portion remains underutilized, limiting data reusability. Here...
Digital Sovereignty: How Nations Are Asserting Control Over Technology Infrastructure
Countries worldwide are implementing digital sovereignty measures to control data flows, technology standards, and digital infrastructure within their borders.
Sharp Convergence Rates for Masked Diffusion Models
arXiv:2602.22505v1 Announce Type: new Abstract: Discrete diffusion models have achieved strong empirical performance in text and other symbolic domains, with masked (absorbing-rate) variants emerging as competitive alternatives to autoregressive models. Among existing samplers, the Euler method remains the standard choice...
Copyright Protection for AI-Generated Works
Since the 2010s, artificial intelligence (AI) has grown quickly out of a subset of machine learning (i.e., deep learning), in particular with recent advances in generative AI such as ChatGPT. The use of generative AI has gone beyond leisure purposes. It...
Precision Medicine and Data Privacy: Balancing Innovation with Patient Rights
The rapid advancement of precision medicine creates unprecedented opportunities for personalized treatment while raising complex data privacy and consent challenges.
Interleaved Head Attention
arXiv:2602.21371v1 Announce Type: new Abstract: Multi-Head Attention (MHA) is the core computational primitive underlying modern Large Language Models (LLMs). However, MHA suffers from a fundamental linear scaling limitation: $H$ attention heads produce exactly $H$ independent attention matrices, with no communication...
MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning
arXiv:2602.21442v1 Announce Type: new Abstract: The recent field of neural algorithmic reasoning (NAR) studies the ability of graph neural networks (GNNs) to emulate classical algorithms like Bellman-Ford, a phenomenon known as algorithmic alignment. At the same time, recent advances in...
Geometric Priors for Generalizable World Models via Vector Symbolic Architecture
arXiv:2602.21467v1 Announce Type: new Abstract: A key challenge in artificial intelligence and neuroscience is understanding how neural systems learn representations that capture the underlying dynamics of the world. Most world models represent the transition function with unstructured neural networks, limiting...
Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches
arXiv:2602.20634v1 Announce Type: new Abstract: The proliferation of hate speech on social media platforms has necessitated the development of effective detection and moderation tools. This study evaluates the efficacy of various machine learning models in identifying hate speech and offensive...
RMIT-ADM+S at the MMU-RAG NeurIPS 2025 Competition
arXiv:2602.20735v1 Announce Type: cross Abstract: This paper presents the award-winning RMIT-ADM+S system for the Text-to-Text track of the NeurIPS 2025 MMU-RAG Competition. We introduce Routing-to-RAG (R2RAG), a research-focused retrieval-augmented generation (RAG) architecture composed of lightweight components that dynamically adapt the retrieval...
FedAvg-Based CTMC Hazard Model for Federated Bridge Deterioration Assessment
arXiv:2602.20194v1 Announce Type: new Abstract: Bridge periodic inspection records contain sensitive information about public infrastructure, making cross-organizational data sharing impractical under existing data governance constraints. We propose a federated framework for estimating a Continuous-Time Markov Chain (CTMC) hazard model of...
GeoPT: Scaling Physics Simulation via Lifted Geometric Pre-Training
arXiv:2602.20399v1 Announce Type: new Abstract: Neural simulators promise efficient surrogates for physics simulation, but scaling them is bottlenecked by the prohibitive cost of generating high-fidelity training data. Pre-training on abundant off-the-shelf geometries offers a natural alternative, yet faces a fundamental...
Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer
arXiv:2602.19058v1 Announce Type: new Abstract: Large vision-language models (LVLMs) have rapidly advanced across various domains, yet they still lag behind strong text-only large language models (LLMs) on tasks that require multi-step inference and compositional decision-making. Motivated by their shared transformer...
Weak-Form Evolutionary Kolmogorov-Arnold Networks for Solving Partial Differential Equations
arXiv:2602.18515v1 Announce Type: new Abstract: Partial differential equations (PDEs) form a central component of scientific computing. Among recent advances in deep learning, evolutionary neural networks have been developed to successively capture the temporal dynamics of time-dependent PDEs via parameter evolution....
Neural Synchrony Between Socially Interacting Language Models
arXiv:2602.17815v1 Announce Type: new Abstract: Neuroscience has uncovered a fundamental mechanism of our social nature: human brain activity becomes synchronized with others in many social contexts involving interaction. Traditionally, social minds have been regarded as an exclusive property of living...
How Vision Becomes Language: A Layer-wise Information-Theoretic Analysis of Multimodal Reasoning
arXiv:2602.15580v1 Announce Type: new Abstract: When a multimodal Transformer answers a visual question, is the prediction driven by visual evidence, linguistic reasoning, or genuinely fused cross-modal computation -- and how does this structure evolve across layers? We address this question...
ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
arXiv:2602.15521v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch is prohibitively expensive. A promising alternative is to convert pretrained dense models into sparse MoEs....
Causally-Guided Automated Feature Engineering with Multi-Agent Reinforcement Learning
arXiv:2602.16435v1 Announce Type: new Abstract: Automated feature engineering (AFE) enables AI systems to autonomously construct high-utility representations from raw tabular data. However, existing AFE methods rely on statistical heuristics, yielding brittle features that fail under distribution shift. We introduce CAFE,...
Language Model Representations for Efficient Few-Shot Tabular Classification
arXiv:2602.15844v1 Announce Type: cross Abstract: The Web is a rich source of structured data in the form of tables, from product catalogs and knowledge bases to scientific datasets. However, the heterogeneity of the structure and semantics of these tables makes...
Epistemology of Generative AI: The Geometry of Knowing
arXiv:2602.17116v1 Announce Type: new Abstract: Generative AI presents an unprecedented challenge to our understanding of knowledge and its production. Unlike previous technological transformations, where engineering understanding preceded or accompanied deployment, generative AI operates through mechanisms whose epistemic character remains obscure,...
When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English
arXiv:2602.16957v1 Announce Type: new Abstract: Euphemisms substitute for socially sensitive expressions, often softening or reframing meaning, and their reliance on cultural and pragmatic context complicates modeling across languages. In this study, we investigate how cross-lingual equivalence influences transfer in multilingual euphemism...
Entropy-Based Data Selection for Language Models
arXiv:2602.17465v1 Announce Type: new Abstract: Modern language models (LMs) increasingly require two critical resources: computational resources and data resources. Data selection techniques can effectively reduce the amount of training data required for fine-tuning LMs. However, their effectiveness is closely related...
Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning
arXiv:2602.16796v1 Announce Type: new Abstract: Fine-tuning pre-trained diffusion and flow models to optimize downstream utilities is central to real-world deployment. Existing entropy-regularized methods primarily maximize expected reward, providing no mechanism to shape tail behavior. However, tail control is often essential:...
Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees
arXiv:2602.16823v1 Announce Type: new Abstract: Automated circuit discovery is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods have made significant progress, they typically depend on heuristics or...
Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
arXiv:2602.16947v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) have become essential in high-stakes domains such as drug discovery, yet their black-box nature remains a significant barrier to trustworthiness. While self-explainable GNNs attempt to bridge this gap, they often rely...
Input out, output in: towards positive-sum solutions to AI-copyright tensions
Abstract: This article addresses the legal tensions between artificial intelligence (AI) development and copyright law, exploring policymaking on the use of copyrighted data for AI training at the input level and the generation of AI content at the output level....
Distributed physics-informed neural networks via domain decomposition for fast flow reconstruction
arXiv:2602.15883v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) offer a powerful paradigm for flow reconstruction, seamlessly integrating sparse velocity measurements with the governing Navier-Stokes equations to recover complete velocity and latent pressure fields. However, scaling such models to large...
MolCrystalFlow: Molecular Crystal Structure Prediction via Flow Matching
arXiv:2602.16020v1 Announce Type: new Abstract: Molecular crystal structure prediction represents a grand challenge in computational chemistry due to large sizes of constituent molecules and complex intra- and intermolecular interactions. While generative modeling has revolutionized structure discovery for molecules, inorganic solids,...
Avey-B
arXiv:2602.15814v1 Announce Type: new Abstract: Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets. Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level parallelism, as popularized by BERT-style architectures....
Size Transferability of Graph Transformers with Convolutional Positional Encodings
arXiv:2602.15239v1 Announce Type: new Abstract: Transformers have achieved remarkable success across domains, motivating the rise of Graph Transformers (GTs) as attention-based architectures for graph-structured data. A key design choice in GTs is the use of Graph Neural Network (GNN)-based positional...