Natural Language Processing Models for Robust Document Categorization
arXiv:2602.20336v1 Announce Type: new Abstract: This article presents an evaluation of several machine learning methods applied to automated text classification, alongside the design of a demonstrative system for unbalanced document categorization and distribution. The study focuses on balancing classification accuracy...
How communicatively optimal are exact numeral systems? Once more on lexicon size and morphosyntactic complexity
arXiv:2602.20372v1 Announce Type: new Abstract: Recent research argues that exact recursive numeral systems optimize communicative efficiency by balancing a tradeoff between the size of the numeral lexicon and the average morphosyntactic complexity (roughly length in morphemes) of numeral terms. We...
Disentangling Geometry, Performance, and Training in Language Models
arXiv:2602.20433v1 Announce Type: new Abstract: Geometric properties of Transformer weights, particularly the unembedding matrix, have been widely useful in language model interpretability research. Yet, their utility for estimating downstream performance remains unclear. In this work, we systematically investigate the relationship...
A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives
arXiv:2602.21351v1 Announce Type: new Abstract: The rapid accumulation of Earth science data has created a significant scalability challenge; while repositories like PANGAEA host vast collections of datasets, citation metrics indicate that a substantial portion remains underutilized, limiting data reusability. Here...
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning
arXiv:2602.21534v1 Announce Type: new Abstract: Agentic reinforcement learning (ARL) has rapidly gained attention as a promising paradigm for training agents to solve complex, multi-step interactive tasks. Despite encouraging early results, ARL remains highly unstable, often leading to training collapse. This...
Distill and Align Decomposition for Enhanced Claim Verification
arXiv:2602.21857v1 Announce Type: new Abstract: Complex claim verification requires decomposing sentences into verifiable subclaims, yet existing methods struggle to align decomposition quality with verification performance. We propose a reinforcement learning (RL) approach that jointly optimizes decomposition quality and verifier alignment...
ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices
arXiv:2602.21858v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) have made significant progress in mobile agent development, yet their capabilities are predominantly confined to a reactive paradigm, where they merely execute explicit user commands. The emerging paradigm of proactive...
Semantic Partial Grounding via LLMs
arXiv:2602.22067v1 Announce Type: new Abstract: Grounding is a critical step in classical planning, yet it often becomes a computational bottleneck due to the exponential growth in grounded actions and atoms as task size increases. Recent advances in partial grounding have...
Inference-time Alignment via Sparse Junction Steering
arXiv:2602.21215v1 Announce Type: cross Abstract: Token-level steering has emerged as a pivotal approach for inference-time alignment, enabling fine-grained control over large language models by modulating their output distributions without parameter updates. While effective, existing methods rely on dense intervention...
EQ-5D Classification Using Biomedical Entity-Enriched Pre-trained Language Models and Multiple Instance Learning
arXiv:2602.21216v1 Announce Type: cross Abstract: The EQ-5D (EuroQol 5-Dimensions) is a standardized instrument for the evaluation of health-related quality of life. In health economics, systematic literature reviews (SLRs) depend on the correct identification of publications that use the EQ-5D, but...
EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors
arXiv:2602.21218v1 Announce Type: cross Abstract: High-quality data is essential for modern machine learning, yet many valuable corpora are sensitive and cannot be freely shared. Synthetic data offers a practical substitute for downstream development, and large language models (LLMs) have emerged...
Field-Theoretic Memory for AI Agents: Continuous Dynamics for Context Preservation
arXiv:2602.21220v1 Announce Type: cross Abstract: We present a memory system for AI agents that treats stored information as continuous fields governed by partial differential equations rather than discrete entries in a database. The approach draws from classical field theory: memories...
Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases
arXiv:2602.21222v1 Announce Type: cross Abstract: Parameter-efficient fine-tuning methods like LoRA have enabled task-specific adaptation of large language models, but efficiently composing multiple specialized adapters for unseen tasks remains challenging. We present a novel framework for dynamic LoRA...
Architecture-Agnostic Curriculum Learning for Document Understanding: Empirical Evidence from Text-Only and Multimodal
arXiv:2602.21225v1 Announce Type: cross Abstract: We investigate whether progressive data scheduling -- a curriculum learning strategy that incrementally increases training data exposure (33% → 67% → 100%) -- yields consistent efficiency gains across architecturally distinct document understanding models. By evaluating BERT (text-only, 110M parameters)...
IslamicLegalBench: Evaluating LLMs Knowledge and Reasoning of Islamic Law Across 1,200 Years of Islamic Pluralist Legal Traditions
arXiv:2602.21226v1 Announce Type: cross Abstract: As millions of Muslims turn to LLMs like GPT, Claude, and DeepSeek for religious guidance, a critical question arises: Can these AI systems reliably reason about Islamic law? We introduce IslamicLegalBench, the first benchmark evaluating...
The Fundamental Right to Education
Article by Derek W. Black. New litigation has revived one of the most important questions of constitutional law: Is education a fundamental right? The Court’s previous answers have been disappointing. While the Court has hinted that...
The Discrimination Presumption
Article by Joseph A. Seiner. Employment discrimination is a fact in our society. Scientific studies continue to show that employer misconduct in the workplace is pervasive. This social science research is further supported by governmental data and litigation...
The New Oral Argument: Justices as Advocates
Article by Tonja Jacobi & Matthew Sag. This Article conducts a comprehensive empirical inquiry of fifty-five years of Supreme Court oral argument, showing that judicial activity has increased dramatically, in terms of words used,...
Transborder Speech
Article by Ronald J. Krotoszynski, Jr. In an increasingly globalized marketplace of ideas, First Amendment law and theory must recognize that the freedom of speech does not end at the water’s edge. Simply put, the locus of expressive activity...
Gains, Losses, and Judges: Framing and the Judiciary
Article by Jeffrey J. Rachlinski & Andrew J. Wistrich. Losses hurt more than foregone gains, an asymmetry that psychologists call “loss aversion.” Losses cause more regret than foregone gains, and people struggle harder to...
Corporate Governance in the Age of AI: Board Responsibilities and Best Practices
As AI transforms business operations, corporate boards face new governance challenges requiring updated oversight frameworks and expertise.
Fintech Regulation 2026: Navigating the New Compliance Landscape
The regulatory environment for fintech has evolved dramatically, with new frameworks addressing digital assets, open banking, and AI-driven financial services.
Autonomous Vehicles and Liability: Who Is Responsible When AI Drives?
As autonomous vehicles approach widespread deployment, legal frameworks for determining liability in accidents involving self-driving cars remain uncertain.
Budget-Aware Agentic Routing via Boundary-Guided Training
arXiv:2602.21227v1 Announce Type: cross Abstract: As large language models (LLMs) evolve into autonomous agents that execute long-horizon workflows, invoking a high-capability model at every step becomes economically unsustainable. While model routing is effective for single-turn queries, agentic routing is a...
ImpRIF: Stronger Implicit Reasoning Leads to Better Complex Instruction Following
arXiv:2602.21228v1 Announce Type: cross Abstract: As applications of large language models (LLMs) become increasingly complex, the demand for robust complex instruction following capabilities is growing accordingly. We argue that a thorough understanding of the instruction itself, especially the latent reasoning...
ACAR: Adaptive Complexity Routing for Multi-Model Ensembles with Auditable Decision Traces
arXiv:2602.21231v1 Announce Type: cross Abstract: We present ACAR (Adaptive Complexity and Attribution Routing), a measurement framework for studying multi-model orchestration under auditable conditions. ACAR uses self-consistency variance (sigma) computed from N=3 probe samples to route tasks across single-model, two-model, and...
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
arXiv:2602.21233v1 Announce Type: cross Abstract: This technical report introduces AngelSlim, a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. By consolidating cutting-edge algorithms, including quantization, speculative decoding, token pruning, and distillation, AngelSlim provides a...
AgenticTyper: Automated Typing of Legacy Software Projects Using Agentic AI
arXiv:2602.21251v1 Announce Type: cross Abstract: Legacy JavaScript systems lack type safety, making maintenance risky. While TypeScript can help, manually adding types is expensive. Previous automated typing research focuses on type inference but rarely addresses type checking setup, definition generation, bug...
A Systematic Review of Algorithmic Red Teaming Methodologies for Assurance and Security of AI Applications
arXiv:2602.21267v1 Announce Type: cross Abstract: Cybersecurity threats are becoming increasingly sophisticated, making traditional defense mechanisms and manual red teaming approaches insufficient for modern organizations. While red teaming has long been recognized as an effective method to identify vulnerabilities by simulating...
Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space
arXiv:2602.21269v1 Announce Type: cross Abstract: We present Group Orthogonalized Policy Optimization (GOPO), a new alignment algorithm for large language models derived from the geometry of Hilbert function spaces. Instead of optimizing on the probability simplex and inheriting the exponential curvature...