Task Expansion and Cross Refinement for Open-World Conditional Modeling
arXiv:2603.13308v1 Announce Type: new Abstract: Open-world conditional modeling (OCM) requires a single model to answer arbitrary conditional queries across heterogeneous datasets, where observed variables and targets vary and arise from a vast open-ended task universe. Because any finite collection of...
Neural Approximation and Its Applications
arXiv:2603.13311v1 Announce Type: new Abstract: Multivariate function approximation is a fundamental problem in machine learning. Classic multivariate function approximations rely on hand-crafted basis functions (e.g., polynomial basis and Fourier basis), which limits their approximation ability and data adaptation ability, resulting...
Modular Neural Computer
arXiv:2603.13323v1 Announce Type: new Abstract: This paper introduces the Modular Neural Computer (MNC), a memory-augmented neural architecture for exact algorithmic computation on variable-length inputs. The model combines an external associative memory of scalar cells, explicit read and write heads, a...
Marked Pedagogies: Examining Linguistic Biases in Personalized Automated Writing Feedback
arXiv:2603.12471v1 Announce Type: new Abstract: Effective personalized feedback is critical to students' literacy development. Though LLM-powered tools now promise to automate such feedback at scale, LLMs are not language-neutral: they privilege standard academic English and reproduce social stereotypes, raising concerns...
Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting
arXiv:2603.12353v1 Announce Type: new Abstract: Traffic forecasting in cellular networks is a challenging spatiotemporal prediction problem due to strong temporal dependencies, spatial heterogeneity across cells, and the need for scalability to large network deployments. Traditional cell-specific models incur prohibitive training...
Bases of Steerable Kernels for Equivariant CNNs: From 2D Rotations to the Lorentz Group
arXiv:2603.12459v1 Announce Type: new Abstract: We present an alternative way of solving the steerable kernel constraint that appears in the design of steerable equivariant convolutional neural networks. We find explicit real and complex bases which are ready to use, for...
Deep Distance Measurement Method for Unsupervised Multivariate Time Series Similarity Retrieval
arXiv:2603.12544v1 Announce Type: new Abstract: We propose the Deep Distance Measurement Method (DDMM) to improve retrieval accuracy in unsupervised multivariate time series similarity retrieval. DDMM enables learning of minute differences within states in the entire time series and thereby recognition...
Structure-Aware Epistemic Uncertainty Quantification for Neural Operator PDE Surrogates
arXiv:2603.11052v1 Announce Type: new Abstract: Neural operators (NOs) provide fast, resolution-invariant surrogates for mapping input fields to PDE solution fields, but their predictions can exhibit significant epistemic uncertainty due to finite data, imperfect optimization, and distribution shift. For practical deployment...
High-resolution weather-guided surrogate modeling for data-efficient cross-location building energy prediction
arXiv:2603.11121v1 Announce Type: new Abstract: Building design optimization often depends on physics-based simulation tools such as EnergyPlus, which, although accurate, are computationally expensive and slow. Surrogate models provide a faster alternative, yet most are location-specific, and even weather-informed variants require...
Algorithmic Capture, Computational Complexity, and Inductive Bias of Infinite Transformers
arXiv:2603.11161v1 Announce Type: new Abstract: We formally define Algorithmic Capture (i.e., "grokking" an algorithm) as the ability of a neural network to generalize to arbitrary problem sizes ($T$) with controllable error and minimal sample adaptation, distinguishing true algorithmic learning from...
UniHetCO: A Unified Heterogeneous Representation for Multi-Problem Learning in Unsupervised Neural Combinatorial Optimization
arXiv:2603.11456v1 Announce Type: new Abstract: Unsupervised neural combinatorial optimization (NCO) offers an appealing alternative to supervised approaches by training learning-based solvers without ground-truth solutions, directly minimizing instance objectives and constraint violations. Yet for graph node subset-selection problems (e.g., Maximum Clique...
The Prediction-Measurement Gap: Toward Meaning Representations as Scientific Instruments
arXiv:2603.10130v1 Announce Type: new Abstract: Text embeddings have become central to computational social science and psychology, enabling scalable measurement of meaning and mixed-method inference. Yet most representation learning is optimized and evaluated for prediction and retrieval, yielding a prediction-measurement gap:...
ViDia2Std: A Parallel Corpus and Methods for Low-Resource Vietnamese Dialect-to-Standard Translation
arXiv:2603.10211v1 Announce Type: new Abstract: Vietnamese exhibits extensive dialectal variation, posing challenges for NLP systems trained predominantly on standard Vietnamese. Such systems often underperform on dialectal inputs, especially from underrepresented Central and Southern regions. Previous work on dialect normalization has...
Training Language Models via Neural Cellular Automata
arXiv:2603.10055v1 Announce Type: new Abstract: Pre-training is crucial for large language models (LLMs), as it is when most representations and capabilities are acquired. However, natural language pre-training has problems: high-quality text is finite, it contains human biases, and it entangles...
A Survey of Weight Space Learning: Understanding, Representation, and Generation
arXiv:2603.10090v1 Announce Type: new Abstract: Neural network weights are typically viewed as the end product of training, while most deep learning research focuses on data, features, and architectures. However, recent advances show that the set of all possible weight values...
AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems
arXiv:2603.09435v1 Announce Type: new Abstract: The rapid rollout of AI in heterogeneous public and societal sectors has subsequently escalated the need for compliance with regulatory standards and frameworks. The EU AI Act has emerged as a landmark in the regulatory...
Curveball Steering: The Right Direction To Steer Isn't Always Linear
arXiv:2603.09313v1 Announce Type: new Abstract: Activation steering is a widely used approach for controlling large language model (LLM) behavior by intervening on internal representations. Existing methods largely rely on the Linear Representation Hypothesis, assuming behavioral attributes can be manipulated using...
Quantifying the Necessity of Chain of Thought through Opaque Serial Depth
arXiv:2603.09786v1 Announce Type: new Abstract: Large language models (LLMs) tend to externalize their reasoning in their chain of thought, making the chain of thought a good target for monitoring. This is partially an inherent feature of the Transformer architecture: sufficiently...
Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields
arXiv:2603.08758v1 Announce Type: new Abstract: Many geometric learning problems require invariants on heterogeneous product spaces, i.e., products of distinct spaces carrying different group actions, where standard techniques do not directly apply. We show that, when a group $G$ acts transitively...
Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates
arXiv:2603.08914v1 Announce Type: new Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (SLT) hypothesis suggests that randomly initialized networks contain sparse subnetworks achieving competitive accuracy without weight training. Existing SLT methods,...
Reforming the Mechanism: Editing Reasoning Patterns in LLMs with Circuit Reshaping
arXiv:2603.06923v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit flawed reasoning ability that undermines reliability. Existing approaches to improving reasoning typically treat it as a general and monolithic skill, applying broad training which is inefficient and unable to...
Switchable Activation Networks
arXiv:2603.06601v1 Announce Type: new Abstract: Deep neural networks, and more recently large-scale generative models such as large language models (LLMs) and large vision-action models (LVAs), achieve remarkable performance across diverse domains, yet their prohibitive computational cost hinders deployment in resource-constrained...
Geodesic Gradient Descent: A Generic and Learning-rate-free Optimizer on Objective Function-induced Manifolds
arXiv:2603.06651v1 Announce Type: new Abstract: Euclidean gradient descent algorithms barely capture the geometry of objective function-induced hypersurfaces and risk driving update trajectories off the hypersurfaces. Riemannian gradient descent algorithms address these issues but fail to represent complex hypersurfaces via a...
Rank-Factorized Implicit Neural Bias: Scaling Super-Resolution Transformer with FlashAttention
arXiv:2603.06738v1 Announce Type: new Abstract: Recent Super-Resolution (SR) methods mainly adopt Transformers for their strong long-range modeling capability and exceptional representational capacity. However, most SR Transformers rely heavily on relative positional bias (RPB), which prevents them from leveraging hardware-efficient attention kernels such...
Natural Language, Legal Hurdles: Navigating the Complexities in Natural Language Processing Development and Application
This article delves into the legal challenges faced in developing and deploying Natural Language Processing (NLP) technologies, focusing particularly on the European Union’s legal framework, especially the DSM Directive, the InfoSoc Directive, and the Artificial Intelligence Act. It addresses the...
Certifying Legal AI Assistants for Unrepresented Litigants: A Global Survey of Access to Civil Justice, Unauthorized Practice of Law, and AI
The global integration of artificial intelligence (AI) into legal services has created a critical need for clarity regarding unauthorized practice of law (UPL) rules. Traditionally, UPL rules prohibited unlicensed individuals from engaging in activities legally reserved for qualified attorneys, including,...
Predictive policing and algorithmic fairness
This paper examines racial discrimination and algorithmic bias in predictive policing algorithms (PPAs), an emerging technology designed to predict threats and suggest solutions in law enforcement. We first describe what discrimination is in a case study of Chicago’s PPA....
Algorithmic Unfairness through the Lens of EU Non-Discrimination Law
Concerns regarding unfairness and discrimination in the context of artificial intelligence (AI) systems have recently received increased attention from both legal and computer science scholars. Yet, the degree of overlap between notions of algorithmic bias and fairness on the one...
Copyright and AI training data—transparency to the rescue?
Generative Artificial Intelligence (AI) models must be trained on vast quantities of data, much of which is composed of copyrighted material. However, AI developers frequently use such content without seeking permission from rightsholders, leading to calls for requirements to...
Civil law regulation of artificial intelligence in the Russian Federation
The purpose of this article is to identify the normative gaps in the legal regulation of the use of artificial intelligence technology and related systems, as well as to identify the degree of need for a more comprehensive legal regulation....