Sparse Crosscoders for diffing MoEs and Dense models
arXiv:2603.05805v1 Announce Type: new Abstract: Mixture of Experts (MoE) achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to dense models. We present a systematic comparison of MoE and dense model internals using crosscoders,...
Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning
arXiv:2603.05900v1 Announce Type: new Abstract: Large language models (LLMs) benefit substantially from supervised fine-tuning (SFT) and reinforcement learning with verifiable rewards (RLVR) in reasoning tasks. However, these recipes perform poorly in instruction-based molecular optimization, where each data point typically provides...
Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis
arXiv:2603.05917v1 Announce Type: new Abstract: Stock market prediction presents considerable challenges for investors, financial institutions, and policymakers operating in complex market environments characterized by noise, non-stationarity, and behavioral dynamics. Traditional forecasting methods often fail to capture the intricate patterns and...
Design Experiments to Compare Multi-armed Bandit Algorithms
arXiv:2603.05919v1 Announce Type: new Abstract: Online platforms routinely compare multi-armed bandit algorithms, such as UCB and Thompson Sampling, to select the best-performing policy. Unlike standard A/B tests for static treatments, each run of a bandit algorithm over $T$ users produces...
Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence
arXiv:2603.05960v1 Announce Type: new Abstract: Memory-efficient optimization methods have recently gained increasing attention for scaling full-parameter training of large language models under the GPU-memory bottleneck. Existing approaches either lack clear convergence guarantees, or only achieve the standard ${\mathcal{O}}(\epsilon^{-4})$ iteration complexity...
EvoESAP: Non-Uniform Expert Pruning for Sparse MoE
arXiv:2603.06003v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (SMoE) language models achieve strong capability at low per-token compute, yet deployment remains memory- and throughput-bound because the full expert pool must be stored and served. Post-training expert pruning reduces this cost, but...
Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments
arXiv:2603.06009v1 Announce Type: new Abstract: Plateaus, where an agent's performance stagnates at a suboptimal level, are a common problem in deep on-policy RL. Focusing on PPO due to its widespread adoption, we show that plateaus in certain regimes arise not...
Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra
arXiv:2603.06113v1 Announce Type: new Abstract: Infrared (IR) spectroscopy, a type of vibrational spectroscopy, is widely used for molecular structure determination and provides critical structural information for chemists. However, existing approaches for recovering molecular structures from IR spectra typically rely on...
Dynamic Momentum Recalibration in Online Gradient Learning
arXiv:2603.06120v1 Announce Type: new Abstract: Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient updates through the...
Partial Policy Gradients for RL in LLMs
arXiv:2603.06138v1 Announce Type: new Abstract: Reinforcement learning is a framework for learning to act sequentially in an unknown environment. We propose a natural approach for modeling policy structure in policy gradients. The key idea is to optimize for a subset...
Ensemble Graph Neural Networks for Probabilistic Sea Surface Temperature Forecasting via Input Perturbations
arXiv:2603.06153v1 Announce Type: new Abstract: Accurate regional ocean forecasting requires models that are both computationally efficient and capable of representing predictive uncertainty. This work investigates ensemble learning strategies for sea surface temperature (SST) forecasting using Graph Neural Networks (GNNs), with...
Topological descriptors of foot clearance gait dynamics improve differential diagnosis of Parkinsonism
arXiv:2603.06212v1 Announce Type: new Abstract: Differential diagnosis among parkinsonian syndromes remains a clinical challenge due to overlapping motor symptoms and subtle gait abnormalities. Accurate differentiation is crucial for treatment planning and prognosis. While gait analysis is a well established approach...
FedSCS-XGB -- Federated Server-centric surrogate XGBoost for continual health monitoring
arXiv:2603.06224v1 Announce Type: new Abstract: Wearable sensors with local data processing can detect health threats early, enhance documentation, and support personalized therapy. In the context of spinal cord injury (SCI), which involves risks such as pressure injuries and blood pressure...
DC-Merge: Improving Model Merging with Directional Consistency
arXiv:2603.06242v1 Announce Type: new Abstract: Model merging aims to integrate multiple task-adapted models into a unified model that preserves the knowledge of each task. In this paper, we identify that the key to this knowledge retention lies in maintaining the...
Google just gave Sundar Pichai a $692M pay package
Most of it is tied to performance, including new stock incentives linked to Waymo and Wing, its drone delivery venture.
The Border Politics of Patents and the Immigrant Inventor
Introduction In the twenty-first-century United States, patents—government grants of exclusive rights to the originator of a new and useful invention—are part of the politics of the border.[1] Patents are relevant to the U.S. border in at least three ways. First,...
Algorithmic and Non-Algorithmic Fairness: Should We Revise our View of the Latter Given Our View of the Former?
Abstract In the US context, critics of court use of algorithmic risk prediction algorithms have argued that COMPAS involves unfair machine bias because it generates higher false positive rates of predicted recidivism for black offenders than for white offenders. In...
Balancing Privacy and Progress: A Review of Privacy Challenges, Systemic Oversight, and Patient Perceptions in AI-Driven Healthcare
Integrating Artificial Intelligence (AI) in healthcare represents a transformative shift with substantial potential for enhancing patient care. This paper critically examines this integration, confronting significant ethical, legal, and technological challenges, particularly in patient privacy, decision-making autonomy, and data integrity. A...
From AI security to ethical AI security: a comparative risk-mitigation framework for classical and hybrid AI governance
Abstract As Artificial Intelligence (AI) systems evolve from classical to hybrid classical-quantum architectures, traditional notions of security—mainly centered on technical robustness—are no longer sufficient. This study aims to provide an integrated security ethics compliance framework that bridges technical and ethical...
Big Data's Disparate Impact
Advocates of algorithmic techniques like data mining argue that these techniques eliminate human biases from the decision-making process. But an algorithm is only as good as the data it works with. Data is frequently imperfect in ways that allow these...
The Dilemma and Countermeasures of AI in Educational Application
This paper divides the application of AI in education into three categories, namely, students-oriented AI, teachers-oriented AI and school managers-oriented AI, which focuses on the individualized self-adaptive learning of students, the assisted teaching of teachers and the service management...
Exacerbating Algorithmic Bias through Fairness Attacks
Algorithmic fairness has attracted significant attention in recent years, with many quantitative measures suggested for characterizing the fairness of different machine learning algorithms. Despite this interest, the robustness of those fairness measures with respect to an intentional adversarial attack has...
Beyond Personhood
This paper examines the evolution of legal personhood and explores whether historical precedents—from corporate personhood to environmental legal recognition—can inform frameworks for governing artificial intelligence (AI). By tracing the development of persona ficta in Roman law and subsequent expansions of...
Digital Monsters: Reconciling AI Narratives as Investigations of Legal Personhood for Artificial Intelligence
Cultural legal investigations of the nexus between law, culture and society are crucial for developing our understanding of how the relationships between humans and artificially intelligent entities (AIE) will evolve along with the technology itself. However, narratives of artificial intelligence...
Demystifying the Draft EU Artificial Intelligence Act — Analysing the good, the bad, and the unclear elements of the proposed approach
AI standardization promises to support the implementation of EU legislation and promote the rapid transfer, transparency, and interoperability of this massively disruptive technology. However, apart from well-known practical difficulties stemming from the unique probabilistic nature and the rapid development of AI...
Critical perspectives on AI in education: political economy, discrimination, commercialization, governance and ethics
AI in education is not only a challenging area of technical development and educational innovation, but increasingly the focus of critical analysis informed by the social sciences, philosophy and theory. This chapter provides an overview of critical perspectives on AI...
Computation of minimum-time feedback control laws for discrete-time systems with state-control constraints
The problem of finding a feedback law that drives the state of a linear discrete-time system to the origin in minimum-time subject to state-control constraints is considered. Algorithms are given to obtain facial descriptions of the $M$-step...
Predicting Outcomes of Legal Cases based on Legal Factors using Classifiers
Predicting outcomes of legal cases may aid in the understanding of the judicial decision-making process. Outcomes can be predicted based on i) case-specific legal factors such as type of evidence ii) extra-legal factors such as the ideological direction of the...
Civil law regulation of artificial Intelligence in the Russian Federation
The purpose of this article is to identify the normative gaps in the legal regulation of the use of artificial intelligence technology and related systems, as well as to identify the degree of need for a more comprehensive legal regulation....
Exploring the ethical, legal, and social implications of cybernetic avatars
A cybernetic avatar (CA) is a concept that encompasses not only avatars representing virtual bodies in cyberspace but also information and communication technology (ICT) and robotic technologies that enhance the physical, cognitive, and perceptual capabilities of humans. CAs can enable...