HoloByte: Continuous Hyperspherical Distillation for Tokenizer-Free Modeling
arXiv:2603.16917v1 Announce Type: new Abstract: Sequence modeling universally relies on discrete subword tokenization to circumvent the $\mathcal{O}(N^2)$ computational intractability of native byte-level attention. However, this heuristic quantization imposes artificial morphological boundaries, enforces vocabulary dependence, and fractures the continuity of the...
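The quadratic blow-up the abstract refers to can be made concrete with a back-of-the-envelope sketch: full attention computes one score per query-key pair, so the cost grows with the square of sequence length, and byte sequences are several times longer than subword sequences. The bytes-per-subword ratio below is an illustrative assumption, not a figure from the paper.

```python
# Sketch: why byte-level attention is O(N^2) while subword tokenization
# shrinks N. The ~4 bytes-per-subword ratio is an illustrative assumption.

def attention_pairs(seq_len: int) -> int:
    """Number of query-key score entries a full attention layer computes."""
    return seq_len * seq_len

text = "tokenization imposes artificial morphological boundaries"
n_bytes = len(text.encode("utf-8"))   # one attention position per byte
n_subwords = max(1, n_bytes // 4)     # assume ~4 bytes per subword token

byte_cost = attention_pairs(n_bytes)
subword_cost = attention_pairs(n_subwords)
print(byte_cost // subword_cost)      # a 4x longer sequence costs ~16x more
```

A 4x reduction in sequence length buys a 16x reduction in attention work, which is what makes tokenizer-free byte modeling computationally hard in the first place.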
SCE-LITE-HQ: Smooth visual counterfactual explanations with generative foundation models
arXiv:2603.17048v1 Announce Type: new Abstract: Modern neural networks achieve strong performance but remain difficult to interpret in high-dimensional visual domains. Counterfactual explanations (CFEs) provide a principled approach to interpreting black-box predictions by identifying minimal input changes that alter model outputs....
SENSE: Efficient EEG-to-Text via Privacy-Preserving Semantic Retrieval
arXiv:2603.17109v1 Announce Type: new Abstract: Decoding brain activity into natural language is a major challenge in AI with important applications in assistive communication, neurotechnology, and human-computer interaction. Most existing Brain-Computer Interface (BCI) approaches rely on memory-intensive fine-tuning of Large Language...
Contextual Preference Distribution Learning
arXiv:2603.17139v1 Announce Type: new Abstract: Decision-making problems often feature uncertainty stemming from heterogeneous and context-dependent human preferences. To address this, we propose a sequential learning-and-optimization pipeline to learn preference distributions and leverage them to solve downstream problems, for example risk-averse...
Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization
arXiv:2603.17247v1 Announce Type: new Abstract: We propose Q-BIOLAT, a framework for modeling and optimizing protein fitness landscapes in binary latent spaces. Starting from protein sequences, we leverage pretrained protein language models to obtain continuous embeddings, which are then transformed into...
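Once fitness is expressed over binary latent variables, the optimization target takes the standard QUBO form minimized by quantum annealers, E(z) = z^T Q z for z in {0,1}^n. The tiny Q matrix below is an illustrative assumption, and brute-force enumeration stands in for the annealer.

```python
# Sketch of a QUBO energy over binary latent variables, the form a quantum
# annealer minimizes. Q is a toy assumption; brute force replaces the annealer.

from itertools import product

def qubo_energy(z, Q):
    """E(z) = z^T Q z for a binary vector z."""
    n = len(z)
    return sum(Q[i][j] * z[i] * z[j] for i in range(n) for j in range(n))

Q = [
    [-1.0,  2.0,  0.0],   # diagonal rewards turning bits on,
    [ 0.0, -1.0,  2.0],   # off-diagonal couplings penalize neighbors
    [ 0.0,  0.0, -1.0],   # being on simultaneously
]

best = min(product([0, 1], repeat=3), key=lambda z: qubo_energy(z, Q))
print(best, qubo_energy(best, Q))   # (1, 0, 1) -2.0
```

The minimizer turns on the two non-interacting bits and leaves the middle bit off, exactly the trade-off structure an annealer searches over at scale.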
Variational Kernel Design for Internal Noise: Gaussian Chaos Noise, Representation Compatibility, and Reliable Deep Learning
arXiv:2603.17365v1 Announce Type: new Abstract: Internal noise in deep networks is usually inherited from heuristics such as dropout, hard masking, or additive perturbation. We ask two questions: what correlation geometry should internal noise have, and is the implemented perturbation compatible...
Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization
arXiv:2603.17478v1 Announce Type: new Abstract: This study explores the combination of automated machine learning (AutoML) with model-based deep unfolding (DU) for optimizing wireless beamforming and waveforms. We convert the iterative proximal gradient descent (PGD) algorithm into a deep neural network,...
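The iterative algorithm being unrolled can be illustrated on a scalar L1-regularized least-squares problem: each PGD iteration is a gradient step on the smooth data term followed by a soft-threshold proximal step, and deep unfolding treats each iteration as a network layer with learnable step size and threshold. The toy problem and fixed hyperparameters below are assumptions for illustration, not the paper's beamforming objective.

```python
# Sketch of proximal gradient descent (ISTA) with a fixed unroll depth, the
# iteration deep unfolding turns into layers. Problem and step size are toy
# assumptions; in DU, `step` and `lam` would be learned per layer.

def soft_threshold(x, t):
    """Prox operator of t*|x|."""
    return max(x - t, 0.0) - max(-x - t, 0.0)

def unrolled_pgd(a, b, lam=0.1, step=0.05, n_layers=200):
    """Minimize 0.5*(a*x - b)^2 + lam*|x| with n_layers unrolled iterations."""
    x = 0.0
    for _ in range(n_layers):
        grad = a * (a * x - b)                       # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)
    return x

x_star = unrolled_pgd(a=2.0, b=1.0)
# closed-form optimum: (a*b - lam) / a^2 = (2.0 - 0.1) / 4 = 0.475
print(round(x_star, 3))   # 0.475
```

Interpretability comes from this structure: every layer is a recognizable PGD step, so the learned parameters retain the meaning of step sizes and thresholds.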
A Dynamic Survey of Fuzzy, Intuitionistic Fuzzy, Neutrosophic, Plithogenic, and Extensional Sets
arXiv:2603.15667v1 Announce Type: new Abstract: Real-world phenomena often exhibit vagueness, partial truth, and incomplete information. To model such uncertainty in a mathematically rigorous way, many generalized set-theoretic frameworks have been introduced, including Fuzzy Sets [1], Intuitionistic Fuzzy Sets [2], Neutrosophic...
ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning
arXiv:2603.16060v1 Announce Type: new Abstract: The dominant paradigm for improving mathematical reasoning in language models relies on Reinforcement Learning with verifiable rewards. Yet existing methods treat each problem instance in isolation without leveraging the reusable strategies that emerge and accumulate...
SciZoom: A Large-scale Benchmark for Hierarchical Scientific Summarization across the LLM Era
arXiv:2603.16131v1 Announce Type: new Abstract: The explosive growth of AI research has created unprecedented information overload, increasing the demand for scientific summarization at multiple levels of granularity beyond traditional abstracts. While LLMs are increasingly adopted for summarization, existing benchmarks remain...
Attribution-Guided Model Rectification of Unreliable Neural Network Behaviors
arXiv:2603.15656v1 Announce Type: new Abstract: The performance of neural network models deteriorates due to their unreliable behavior on non-robust features of corrupted samples. Owing to their opaque nature, rectifying models to address this problem often necessitates arduous data cleaning and...
Flood Risk Follows Valleys, Not Grids: Graph Neural Networks for Flash Flood Susceptibility Mapping in Himachal Pradesh with Conformal Uncertainty Quantification
arXiv:2603.15681v1 Announce Type: new Abstract: Flash floods are the most destructive natural hazard in Himachal Pradesh (HP), India, causing over 400 fatalities and $1.2 billion in losses in the 2023 monsoon season alone. Existing risk maps treat every pixel independently,...
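The conformal uncertainty quantification mentioned in the title can be sketched with split conformal prediction: absolute residuals on a held-out calibration set yield a quantile that widens point predictions into intervals with finite-sample coverage guarantees. The data and the deliberately crude constant predictor below are toy assumptions.

```python
# Sketch of split conformal prediction: calibration residuals give a
# quantile q, and predictions become intervals [y_hat - q, y_hat + q] with
# coverage ~ 1 - alpha. Data and predictor are toy assumptions.

import math

def conformal_quantile(residuals, alpha=0.1):
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))   # adjusted rank for valid coverage
    return sorted(residuals)[min(k, n) - 1]

cal_truth = [1.0, 1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.4, 1.0]
cal_pred = [1.1] * len(cal_truth)          # stand-in for model predictions
residuals = [abs(y, ) if False else abs(y - p) for y, p in zip(cal_truth, cal_pred)]
q = conformal_quantile(residuals, alpha=0.2)

new_pred = 1.1
interval = (new_pred - q, new_pred + q)    # susceptibility score +/- q
print(interval)
```

The guarantee is distribution-free: regardless of the underlying GNN, roughly 80% of future true values fall inside such intervals, which is what makes conformal methods attractive for hazard mapping.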
The Agentic Researcher: A Practical Guide to AI-Assisted Research in Mathematics and Machine Learning
arXiv:2603.15914v1 Announce Type: new Abstract: AI tools and agents are reshaping how researchers work, from proving theorems to training neural networks. Yet for many, it remains unclear how these tools fit into everyday research practice. This paper is a practical...
Generative Inverse Design with Abstention via Diagonal Flow Matching
arXiv:2603.15925v1 Announce Type: new Abstract: Inverse design aims to find design parameters $x$ achieving target performance $y^*$. Generative approaches learn bidirectional mappings between designs and labels, enabling diverse solution sampling. However, standard conditional flow matching (CFM), when adapted to inverse...
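The standard CFM setup the abstract builds on has a simple training target: along the straight-line path x_t = (1 - t)·x0 + t·x1 between a noise sample and a data sample, the velocity field is regressed onto x1 - x0. Scalars stand in for design vectors below; the values are illustrative assumptions.

```python
# Sketch of the standard conditional flow matching regression target:
# interpolate x_t on the straight path, regress velocity onto x1 - x0.
# Scalar values are toy assumptions standing in for design vectors.

def cfm_training_pair(x0, x1, t):
    """Return (input point, target velocity) for one CFM training sample."""
    x_t = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0          # constant velocity along the straight path
    return x_t, target_v

x0, x1 = -0.8, 2.5              # noise sample -> data/design sample
x_t, v = cfm_training_pair(x0, x1, t=0.5)
print(x_t, v)                   # midpoint 0.85, velocity 3.3
```

Sampling integrates the learned velocity field from noise to designs; conditioning it on a target label y* is what turns the generator into an inverse-design sampler.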
The ARC of Progress towards AGI: A Living Survey of Abstraction and Reasoning
arXiv:2603.13372v1 Announce Type: new Abstract: The Abstraction and Reasoning Corpus (ARC-AGI) has become a key benchmark for fluid intelligence in AI. This survey presents the first cross-generation analysis of 82 approaches across three benchmark versions and the ARC Prize 2024-2025...
DyACE: Dynamic Algorithm Co-evolution for Online Automated Heuristic Design with Large Language Model
arXiv:2603.13344v1 Announce Type: new Abstract: The prevailing paradigm in Automated Heuristic Design (AHD) typically relies on the assumption that a single, fixed algorithm can effectively navigate the shifting dynamics of a combinatorial search. This static approach often proves inadequate for...
ICaRus: Identical Cache Reuse for Efficient Multi-Model Inference
arXiv:2603.13281v1 Announce Type: new Abstract: Multi-model inference has recently emerged as a prominent paradigm, particularly in the development of agentic AI systems. However, in such scenarios, each model must maintain its own Key-Value (KV) cache for the identical prompt,...
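The duplication the abstract describes can be illustrated with a prefix-keyed store: if every model prefilled the same shared prompt independently, the expensive prefill would run once per model instead of once total. The class below is an illustrative stand-in, not the ICaRus design, and a real system must additionally handle the fact that KV tensors are model-specific, which is precisely the hard part.

```python
# Sketch of the motivating inefficiency: N models sharing one prompt
# trigger N identical prefills unless the cache is keyed by the prompt.
# Names, structure, and the list stand-in for KV tensors are assumptions.

import hashlib

class PrefixKVStore:
    def __init__(self):
        self.cache = {}
        self.prefill_calls = 0

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get_kv(self, prompt: str):
        key = self._key(prompt)
        if key not in self.cache:
            self.prefill_calls += 1                        # expensive prefill
            self.cache[key] = [ord(c) % 7 for c in prompt]  # stand-in for K/V
        return self.cache[key]

store = PrefixKVStore()
for _agent in ["planner", "critic", "executor"]:   # three models, same prompt
    store.get_kv("shared system prompt")
print(store.prefill_calls)   # 1 instead of 3
```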
RelayCaching: Accelerating LLM Collaboration via Decoding KV Cache Reuse
arXiv:2603.13289v1 Announce Type: new Abstract: The increasing complexity of AI tasks has shifted the paradigm from monolithic models toward multi-agent large language model (LLM) systems. However, these collaborative architectures introduce a critical bottleneck: redundant prefill computation for shared content generated...
Neural Approximation and Its Applications
arXiv:2603.13311v1 Announce Type: new Abstract: Multivariate function approximation is a fundamental problem in machine learning. Classic multivariate function approximations rely on hand-crafted basis functions (e.g., polynomial basis and Fourier basis), which limits their approximation ability and data adaptation ability, resulting...
Modular Neural Computer
arXiv:2603.13323v1 Announce Type: new Abstract: This paper introduces the Modular Neural Computer (MNC), a memory-augmented neural architecture for exact algorithmic computation on variable-length inputs. The model combines an external associative memory of scalar cells, explicit read and write heads, a...
PolyGLU: State-Conditional Activation Routing in Transformer Feed-Forward Networks
arXiv:2603.13347v1 Announce Type: new Abstract: Biological neural systems employ diverse neurotransmitters -- glutamate, GABA, dopamine, acetylcholine -- to implement distinct signal-processing modalities within shared neural circuits. In contrast, modern transformers apply a single fixed activation function across all feed-forward neurons....
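The contrast the abstract draws can be sketched as a gated mixture over several nonlinearities: instead of one fixed activation for every feed-forward neuron, a gate, which in a real model would be computed from the hidden state, routes between distinct signal-processing modes. The gate values and activation set below are illustrative assumptions, not the PolyGLU design.

```python
# Sketch of state-conditional activation routing: a convex gate mixes
# several activation functions per neuron. Gates and the activation set
# are toy assumptions; a real gate is computed from the hidden state.

import math

ACTIVATIONS = [
    lambda x: max(0.0, x),                 # ReLU-like, half-wave rectifying
    lambda x: math.tanh(x),                # saturating / inhibitory-like
    lambda x: x / (1.0 + math.exp(-x)),    # SiLU-like, smooth gating
]

def routed_activation(x, gate):
    """Gate-weighted mixture of activations for one neuron."""
    assert abs(sum(gate) - 1.0) < 1e-9     # convex combination
    return sum(g * f(x) for g, f in zip(gate, ACTIVATIONS))

print(routed_activation(1.0, [1.0, 0.0, 0.0]))    # pure ReLU path -> 1.0
print(routed_activation(-2.0, [0.0, 1.0, 0.0]))   # pure tanh path
```

With a hard one-hot gate this reduces to picking one activation per neuron; soft gates let the network interpolate between modes within a shared circuit, echoing the neurotransmitter analogy.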
Synthetic Data Generation for Brain-Computer Interfaces: Overview, Benchmarking, and Future Directions
arXiv:2603.12296v1 Announce Type: cross Abstract: Deep learning has achieved transformative performance across diverse domains, largely driven by large-scale, high-quality training data. In contrast, the development of brain-computer interfaces (BCIs) is fundamentally constrained by the limited, heterogeneous, and privacy-sensitive neural...
The DIME Architecture: A Unified Operational Algorithm for Neural Representation, Dynamics, Control and Integration
arXiv:2603.12286v1 Announce Type: cross Abstract: Modern neuroscience has accumulated extensive evidence on perception, memory, prediction, valuation, and consciousness, yet still lacks an explicit operational architecture capable of integrating these phenomena within a unified computational framework. Existing theories address specific aspects...
Diagnosing Retrieval Bias Under Multiple In-Context Knowledge Updates in Large Language Models
arXiv:2603.12271v1 Announce Type: cross Abstract: LLMs are widely used in knowledge-intensive tasks where the same fact may be revised multiple times within context. Unlike prior work focusing on one-shot updates or single conflicts, multi-update scenarios contain multiple historically valid versions...
DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs
arXiv:2603.12269v1 Announce Type: cross Abstract: Early-exit deep neural networks enable adaptive inference by terminating computation when sufficient confidence is achieved, reducing cost for edge AI accelerators in resource-constrained settings. Existing methods, however, rely on suboptimal exit policies, ignore input difficulty,...
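The fixed-threshold exit policy that DART aims to improve on is simple to state: each intermediate head emits a confidence, and inference stops at the first head that clears the threshold, falling through to the final head otherwise. The heads and confidence values below are toy assumptions.

```python
# Sketch of a fixed-threshold early-exit policy: stop at the first exit
# head whose confidence clears the threshold. Confidences are toy values.

def early_exit(confidences, threshold=0.9):
    """Return (exit_index, heads_evaluated) for the first confident exit."""
    for i, c in enumerate(confidences):
        if c >= threshold:
            return i, i + 1
    return len(confidences) - 1, len(confidences)   # fall through to last head

easy_input = [0.95, 0.99, 0.99]   # confident immediately
hard_input = [0.40, 0.70, 0.85]   # never clears the bar
print(early_exit(easy_input))     # (0, 1): one head evaluated
print(early_exit(hard_input))     # (2, 3): full depth used
```

A single global threshold treats all inputs alike; making it input-difficulty-aware, as the title indicates, means adapting it per input rather than fixing it per model.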
Unmasking Biases and Reliability Concerns in Convolutional Neural Networks Analysis of Cancer Pathology Images
arXiv:2603.12445v1 Announce Type: cross Abstract: Convolutional Neural Networks have shown promise in identifying different types of cancer from radiographs. However, the opaque nature of CNNs makes it difficult to fully understand the way they operate, limiting their assessment to...
98$\times$ Faster LLM Routing Without a Dedicated GPU: Flash Attention, Prompt Compression, and Near-Streaming for the vLLM Semantic Router
arXiv:2603.12646v1 Announce Type: new Abstract: System-level routers that intercept LLM requests for safety classification, domain routing, and PII detection must be both fast and operationally lightweight: they should add minimal latency to every request, yet not require a dedicated GPU...
Interpretable Semantic Gradients in SSD: A PCA Sweep Approach and a Case Study on AI Discourse
arXiv:2603.13038v1 Announce Type: new Abstract: Supervised Semantic Differential (SSD) is a mixed quantitative-interpretive method that models how text meaning varies with continuous individual-difference variables by estimating a semantic gradient in an embedding space and interpreting its poles through clustering and...
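The PCA machinery behind such a sweep can be sketched in miniature: the leading principal axis of a set of embeddings is recovered by power iteration on their covariance, and each text is then scored by its projection onto that axis, giving the positions swept along the semantic gradient. The 2-D "embeddings" below are toy assumptions standing in for real sentence embeddings.

```python
# Sketch of a PCA sweep: find the top principal axis of embeddings by
# power iteration, then order texts by projection onto it. The 2-D
# embeddings are toy assumptions.

def top_pc(points, iters=100):
    """Leading eigenvector of the 2x2 covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    dx = [p[0] - mx for p in points]
    dy = [p[1] - my for p in points]
    cxx = sum(a * a for a in dx) / n
    cxy = sum(a * b for a, b in zip(dx, dy)) / n
    cyy = sum(b * b for b in dy) / n
    v = (1.0, 0.0)
    for _ in range(iters):   # power iteration converges to the top eigenvector
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

embeddings = [(0.0, 0.1), (1.0, 1.1), (2.0, 1.9), (3.0, 3.2)]
axis = top_pc(embeddings)
scores = [p[0] * axis[0] + p[1] * axis[1] for p in embeddings]  # sweep positions
print(axis)
```

Interpreting the two poles of `axis`, here the low-score and high-score ends of the projection, is the qualitative half of the SSD method the abstract describes.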
Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages
arXiv:2603.12554v1 Announce Type: cross Abstract: Reinforcement learning (RL) has been effective for post-training autoregressive (AR) language models, but extending these methods to diffusion language models (DLMs) is challenging due to intractable sequence-level likelihoods. Existing approaches therefore rely on surrogate likelihoods...
Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting
arXiv:2603.12353v1 Announce Type: new Abstract: Traffic forecasting in cellular networks is a challenging spatiotemporal prediction problem due to strong temporal dependencies, spatial heterogeneity across cells, and the need for scalability to large network deployments. Traditional cell-specific models incur prohibitive training...