A Dynamic Self-Evolving Extraction System
arXiv:2603.06915v1 Announce Type: new Abstract: The extraction of structured information from raw text is a fundamental component of many NLP applications, including document retrieval, ranking, and relevance estimation. High-quality extractions often require domain-specific accuracy, up-to-date understanding of specialized taxonomies, and...
KohakuRAG: A simple RAG framework with hierarchical document indexing
arXiv:2603.07612v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) systems that answer questions from document collections face compounding difficulties when high-precision citations are required: flat chunking strategies sacrifice document structure, single-query formulations miss relevant passages through vocabulary mismatch, and single-pass inference...
Whitening Reveals Cluster Commitment as the Geometric Separator of Hallucination Types
arXiv:2603.07755v1 Announce Type: new Abstract: A geometric hallucination taxonomy distinguishes three failure types -- center-drift (Type 1), wrong-well convergence (Type 2), and coverage gaps (Type 3) -- by their signatures in embedding cluster space. Prior work found Types 1 and 2 indistinguishable in full-dimensional contextual...
Scale Dependent Data Duplication
arXiv:2603.06603v1 Announce Type: new Abstract: Data duplication during pretraining can degrade generalization and lead to memorization, motivating aggressive deduplication pipelines. However, at web scale, it is unclear what constitutes a "duplicate": beyond surface-form matches, semantically equivalent documents (e.g. translations) may...
Advances in GRPO for Generation Models: A Survey
arXiv:2603.06623v1 Announce Type: new Abstract: Large-scale flow matching models have achieved strong performance across generative tasks such as text-to-image, video, 3D, and speech synthesis. However, aligning their outputs with human preferences and task-specific objectives remains challenging. Flow-GRPO extends Group Relative...
ERP-RiskBench: Leakage-Safe Ensemble Learning for Financial Risk
arXiv:2603.06671v1 Announce Type: new Abstract: Financial risk detection in Enterprise Resource Planning (ERP) systems is an important but underexplored application of machine learning. Published studies in this area tend to suffer from vague dataset descriptions, leakage-prone pipelines, and evaluation practices...
EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair
arXiv:2603.05553v1 Announce Type: cross Abstract: Function-calling agents -- large language models that invoke tools and APIs -- require high-quality, domain-specific training data spanning executable environments, backing databases, and diverse multi-turn trajectories. We introduce EigenData, an integrated, self-evolving platform that automates...
Longitudinal Lesion Inpainting in Brain MRI via 3D Region Aware Diffusion
arXiv:2603.05693v1 Announce Type: cross Abstract: Accurate longitudinal analysis of brain MRI is often hindered by evolving lesions, which bias automated neuroimaging pipelines. While deep generative models have shown promise in inpainting these lesions, most existing methods operate cross-sectionally or lack...
Building an Ensemble LLM Semantic Tagger for UN Security Council Resolutions
arXiv:2603.05895v1 Announce Type: new Abstract: This paper introduces a new methodology for using LLM-based systems for accurate and efficient semantic tagging of UN Security Council resolutions. The main goal is to leverage LLM performance variability to build ensemble systems for...
CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation
arXiv:2603.06183v1 Announce Type: new Abstract: We introduce CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that assesses reports based on diagnostic correctness, contextual relevance, and patient safety. Unlike prior metrics, CRIMSON incorporates full clinical context, including patient...
Exploring the ethical, legal, and social implications of cybernetic avatars
A cybernetic avatar (CA) is a concept that encompasses not only avatars representing virtual bodies in cyberspace but also information and communication technology (ICT) and robotic technologies that enhance the physical, cognitive, and perceptual capabilities of humans. CAs can enable...
A Study on the Institutionalization and Legal Improvement of Private Security and Security Services using AI and IoT Technology
Critical perspectives on AI in education: political economy, discrimination, commercialization, governance and ethics
AI in education is not only a challenging area of technical development and educational innovation, but increasingly the focus of critical analysis informed by the social sciences, philosophy and theory. This chapter provides an overview of critical perspectives on AI...
Operationalising AI governance through ethics-based auditing: an industry case study
Abstract Ethics-based auditing (EBA) is a structured process whereby an entity’s past or present behaviour is assessed for consistency with moral principles or norms. Recently, EBA has attracted much attention as a governance mechanism that may help to bridge the gap...
Approaches to Protecting Intellectual Property Rights in Open-Source Software and AI-Generated Products, Including Copyright Protection in AI Training.
China’s regulatory approaches to open-source resources and software deserve special attention due to the widespread global use of Chinese-developed solutions. China’s activity in the open-source software sector surged in 2020, laying the foundation for the type of innovations seen today....
Academic Calendar
2025-26 Academic Calendar. Please note: All times in U.S. Central. Event: Date / Time. First Registration Appointment Window (all 3Ls): June 16 (YES opens at 12:35 PM) through June 22 (YES closes at 11:59 PM). Second Registration Appointment Window (all 2Ls/3Ls): June 23...
AI Legal Insight Analyser (ALIA)
The AI Legal Insight Analyzer (ALIA) is a smart web application designed to make legal document analysis faster, easier, and more accurate. By combining artificial intelligence (AI) with natural language processing (NLP), ALIA helps legal professionals, researchers, and students efficiently...
Good models borrow, great models steal: intellectual property rights and generative AI
Abstract Two critical policy questions will determine the impact of generative artificial intelligence (AI) on the knowledge economy and the creative sector. The first concerns how we think about the training of such models—in particular, whether the creators or owners...
Regulatory Settlement, Stare Decisis, and Loper Bright
In Loper Bright v. Raimondo, the Supreme Court adopted and deployed a particular narrative about agency action in support of overruling Chevron: Agencies reverse their own statutory interpretations “as much as [they] like[],” creating pervasive instability in the law, thereby...
Per Se Non-Takings
Introduction Contestation over methodology remains an enduring friction point in the discourse on the Takings Clause. For decades, the Supreme Court’s takings jurisprudence has vacillated between categorical, per se reasoning and contextual, ad hoc inquiries into what fairness and justice...
Recent Policies, Regulations and Laws Related to Artificial Intelligence Across the Central Asia
Artificial intelligence as a technology is developing fast in the Central Asian region. In the post-COVID world, it is expected to change people’s lives by improving healthcare (e.g. making diagnosis more precise, enabling better prevention of diseases), increasing the efficiency...
Machine Ethics: The Design and Governance of Ethical AI and Autonomous Systems [Scanning the Issue]
The so-called fourth industrial revolution and its economic and societal implications are no longer solely an academic concern, but a matter for political as well as public debate. Characterized as the convergence of robotics, AI, autonomous systems and information technology...
Reconstituting Corporate Power & Accountability
Introduction Modern society faces a paradox: While corporations can be useful engines of innovation and value creation, they increasingly operate as vectors for profound public harm beyond the reach of public regulation. The “economic and human tolls,” experts note, “almost...
Patents’ “Self-Consistency” Question: Diversion and Blocking Under a Patent-Racing Model
Introduction The United States patent system is commonly justified by its provision of economic incentives for innovation.[1] But this justification comes with constant concern that the social benefits of innovation that the patent system stimulates might not outweigh the sum...
“Proven” Safety Regulations: Massachusetts 1805 Proving Law As Historical Analogue for Modern Gun Safety Laws - Minnesota Law Review
By Billy Clark. Concerned by the public health threats posed by certain firearms, the Massachusetts legislature enacts a law to set safety standards for firearms in the Commonwealth. Firearm dealers across the State, including some of the leading...
Securitising AI: routine exceptionality and digital governance in the Gulf
Abstract This article examines how Gulf Cooperation Council (GCC) states securitise artificial intelligence (AI) through discourses and infrastructures that fuse modernisation with regime resilience. Drawing on securitisation theory (Buzan et al., 1998; Balzacq, 2011) and critical security studies, it analyses...
IP’s Pluralism Puzzle
Introduction At the core of intellectual property (IP) law lies a fundamental question of political philosophy: Can any argument justify the state’s grant of private property rights in intangibles?[1] To this question, scholars have responded that IP rights can be...
Technologies of Violence: Law, Markets, and Innovation for Gun Safety
Introduction Guns play a variety of roles in American life—as tools of crime and self-defense, political symbols, markers of individual identity, instruments of recreation, and more. But at the most basic level, guns are a technology designed to inflict violence,...