A Theory of LLM Information Susceptibility
arXiv:2603.23626v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed as optimization modules in agentic systems, yet the fundamental limits of such LLM-mediated improvement remain poorly understood. Here we propose a theory of LLM information susceptibility, centred on...
Lightweight Fairness for LLM-Based Recommendations via Kernelized Projection and Gated Adapters
arXiv:2603.23780v1 Announce Type: new Abstract: Large Language Models (LLMs) have introduced new capabilities to recommender systems, enabling dynamic, context-aware, and conversational recommendations. However, LLM-based recommender systems inherit and may amplify social biases embedded in their pre-training data, especially when demographic...
Probabilistic Geometric Alignment via Bayesian Latent Transport for Domain-Adaptive Foundation Models
arXiv:2603.23783v1 Announce Type: new Abstract: Adapting large-scale foundation models to new domains with limited supervision remains a fundamental challenge due to latent distribution mismatch, unstable optimization dynamics, and miscalibrated uncertainty propagation. This paper introduces an uncertainty-aware probabilistic latent transport framework...
This academic article, while highly technical, signals a development with clear relevance to IP practice: the **increasing sophistication and adaptability of AI models**. The research on "domain-adaptive foundation models" and "uncertainty-aware probabilistic latent transport" suggests advancements in how AI can be trained and applied across diverse datasets with greater efficiency and reliability. For legal practice, this points to future challenges and opportunities in areas such as **data ownership and licensing for AI training, liability for AI outputs trained on diverse data, and the potential for AI to generate more robust and less biased outputs, impacting patentability and copyright issues related to AI-generated content.**
The technical advancements in "Probabilistic Geometric Alignment via Bayesian Latent Transport" present fascinating, albeit indirect, implications for Intellectual Property practice, particularly concerning the patentability of AI-driven innovations and the protection of data-driven models.

**Jurisdictional Comparison and Implications Analysis:**

The abstract describes a novel framework for domain adaptation in foundation models, focusing on "uncertainty-aware probabilistic latent transport" and "stochastic geometric alignment." This involves sophisticated mathematical and computational techniques to improve model adaptability and robustness.

* **United States:** In the US, the patentability of AI algorithms and software is primarily governed by *Alice Corp. v. CLS Bank International*. The key challenge lies in demonstrating that the invention constitutes significantly more than an abstract idea. This paper's framework, with its "Bayesian transport operator," "PAC-Bayesian regularization mechanism," and "theoretical guarantees on convergence stability," presents a strong case for meeting the "inventive concept" requirement. The specific formulation of domain adaptation as a "stochastic geometric alignment problem" and the empirical demonstration of improved performance (e.g., "substantial reduction in latent manifold discrepancy") could help overcome abstract-idea rejections by showing a practical application and a technical solution to a specific problem in the field of AI. The focus on *how* the model adapts, rather than just the outcome, strengthens the argument for patent eligibility.
* **South Korea:** South Korea, while also adhering to principles that prevent the patenting
This article describes a novel approach to adapting large-scale foundation models, which could have significant implications for patentability and infringement analysis in AI/ML. The "uncertainty-aware probabilistic latent transport framework" and its specific components, such as the "Bayesian transport operator" and "PAC-Bayesian regularization mechanism," represent potentially patentable subject matter under 35 U.S.C. § 101, assuming they meet the other criteria of novelty and non-obviousness.

For patent prosecution, practitioners should focus on clearly defining the inventive steps related to the *method* of stochastic geometric alignment, the *architecture* incorporating the Bayesian transport operator, and the *system* for domain adaptation that leverages PAC-Bayesian regularization. The theoretical guarantees on convergence stability, loss landscape smoothness, and sample efficiency could serve as strong evidence of unexpected results or advantages, bolstering arguments against obviousness under 35 U.S.C. § 103.

In terms of infringement, the detailed description of how this framework "redistributes latent probability mass along Wasserstein-type geodesic trajectories" and "constrains posterior model complexity" provides specific technical details that could be used to identify infringing implementations. Claims drafted around these functional and structural elements would be crucial. For example, a claim covering "a method for adapting a foundation model comprising: applying a Bayesian transport operator to redistribute latent probability mass along Wasserstein-type geodesic trajectories..." would be highly relevant.

Regarding validity,
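The claim language quoted above turns on "redistributing latent probability mass along Wasserstein-type geodesic trajectories." The paper's actual transport operator is not given in the abstract, but the geometric idea can be illustrated in the one setting where the geodesic has a simple closed form: one-dimensional Gaussians. A minimal sketch (the "source-domain" and "target-domain" labels are illustrative assumptions, not from the paper):

```python
import numpy as np

def gaussian_w2_geodesic(m0, s0, m1, s1, t):
    """Point at time t on the Wasserstein-2 geodesic between the 1-D
    Gaussians N(m0, s0^2) and N(m1, s1^2). For Gaussians, displacement
    interpolation stays Gaussian with linearly interpolated parameters --
    unlike the bimodal mixture obtained by naively averaging densities."""
    m_t = (1 - t) * m0 + t * m1
    s_t = (1 - t) * s0 + t * s1
    return m_t, s_t

def w2_gaussian(m0, s0, m1, s1):
    """Closed-form Wasserstein-2 distance between two 1-D Gaussians."""
    return np.sqrt((m0 - m1) ** 2 + (s0 - s1) ** 2)

# Midpoint of transport from a hypothetical source-domain latent N(0, 1)
# to a hypothetical target-domain latent N(4, 2^2):
m, s = gaussian_w2_geodesic(0.0, 1.0, 4.0, 2.0, 0.5)
print(m, s)  # -> 2.0 1.5: the midpoint Gaussian N(2, 1.5^2)
```

The defining geodesic property, that the midpoint sits at exactly half the total Wasserstein-2 distance from each endpoint, holds term by term in this closed form, which is what "moving mass along the geodesic" means concretely.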
Manifold Generalization Provably Precedes Memorization in Diffusion Models
arXiv:2603.23792v1 Announce Type: new Abstract: Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show...
This academic article, "Manifold Generalization Provably Precedes Memorization in Diffusion Models," delves into the underlying mechanisms of how diffusion models generate novel content even when the learned score is only "coarse." It suggests that these models capture the geometric structure of data rather than memorizing exact distributions, leading to generalization rather than mere replication. For IP practice, this research is highly relevant to the ongoing debates around copyright infringement and fair use in AI-generated content. The finding that diffusion models generalize from data geometry, rather than memorizing specific inputs, could strengthen arguments that AI outputs are transformative and not direct copies, potentially influencing legal interpretations of derivative works and originality in copyright law. This understanding could also inform policy discussions on data licensing for AI training, as it highlights the models' ability to create new content from generalized patterns rather than exact reproductions.
The paper "Manifold Generalization Provably Precedes Memorization in Diffusion Models" offers a fascinating theoretical lens through which to understand the generative capabilities of diffusion models, particularly their ability to produce novel outputs even from a "coarse" learned score. This insight has significant implications for intellectual property (IP) practice, particularly in the realm of copyright and inventorship, by refining our understanding of what constitutes "originality" and "creation" in AI-generated content.

The core argument, that diffusion models capture the *geometry* of data rather than merely memorizing its *distributional structure*, directly challenges the simplistic notion that AI models are merely sophisticated copy machines. If a model is indeed learning underlying manifold regularities and generating outputs based on these learned geometric principles, rather than reproducing specific training data points, it strengthens the argument for the *originality* of AI-generated works. This theoretical underpinning could influence how courts and IP offices assess copyrightability, potentially shifting the focus from direct input-output comparisons to the sophistication of the generative process and the novelty of the resulting output.

**Jurisdictional Comparisons and Implications Analysis:**

The implications of this research diverge across jurisdictions, reflecting their varying approaches to AI and IP.

* **United States:** The U.S. Copyright Office (USCO) has, to date, taken a relatively restrictive stance, emphasizing the need for human authorship in AI-generated works. The USCO's current guidance suggests that works "produced by a machine
This article's findings regarding diffusion models' ability to generate novel samples from coarse scores, by capturing data geometry rather than fine-scale distribution, have significant implications for patent practitioners in the AI/ML space.

**Implications for Practitioners:**

1. **Claim Scope and Enablement (35 U.S.C. § 112):** The concept of "coarse scores capturing the geometry of the data" suggests that claims directed to AI models might be enabled even if they don't explicitly define every fine-grained parameter or training detail. If the core innovation lies in *how* the model learns and leverages data geometry for generalization, rather than precise density estimation, then broad claims focusing on this geometric learning could be defensible. Conversely, if an inventor claims a specific "fine-scale distributional structure," but the underlying model operates on coarse geometric principles, the claim might lack adequate written description or enablement for the *actual* invention. This connects to cases like *Ariad Pharmaceuticals, Inc. v. Eli Lilly and Co.* regarding written description, and *The Medicines Co. v. Hospira, Inc.* on enablement, where the specification must teach one of ordinary skill in the art how to make and use the invention without undue experimentation.
2. **Inventive Step/Non-Obviousness (35 U.S.C. § 103):** The article highlights that this generalization behavior "is a phenomenon not accounted
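The legal questions above hinge on whether a model replicates training points or generalizes beyond them. The paper's theory is not reproduced here, but a common empirical proxy for replication, distance from each generated sample to its nearest training sample, can be sketched. This is a toy illustration on synthetic data under stated assumptions, not the paper's method:

```python
import numpy as np

def nearest_training_distance(generated, training):
    """For each generated sample, the Euclidean distance to its nearest
    training sample -- a common replication proxy: distances near zero
    suggest memorization (copying points), while clearly positive
    distances suggest movement along the data manifold instead."""
    # pairwise squared distances via broadcasting: shape (n_gen, n_train)
    d2 = ((generated[:, None, :] - training[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(axis=1))

rng = np.random.default_rng(0)
training = rng.normal(size=(500, 8))    # stand-in training set
copies = training[:10] + 1e-6           # near-verbatim replicas
novel = rng.normal(size=(10, 8))        # fresh draws from the same law

print(nearest_training_distance(copies, training).max())   # ~0: replicated
print(nearest_training_distance(novel, training).min())    # clearly > 0
```

In a copyright dispute, metrics of this shape are what would separate "the model emitted a training image" from "the model sampled something new from the learned geometry," which is exactly the distinction the paper formalizes.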
Why the Maximum Second Derivative of Activations Matters for Adversarial Robustness
arXiv:2603.23860v1 Announce Type: new Abstract: This work investigates the critical role of activation function curvature -- quantified by the maximum second derivative $\max|\sigma''|$ -- in adversarial robustness. Using the Recursive Curvature-Tunable Activation Family (RCT-AF), which enables precise control over curvature...
This academic article, while highly technical, signals a growing focus on the intrinsic properties of AI models, specifically activation functions, as a key determinant of "adversarial robustness." From an IP perspective, this research highlights the increasing importance of understanding and potentially patenting novel AI architectures and training methodologies that enhance resilience against adversarial attacks. The identification of an optimal curvature range (4 to 10) for robust generalization could lead to new standards or best practices for developing secure AI, potentially influencing future regulatory discussions around AI safety and reliability.
This research, while highly technical in its focus on activation function curvature and adversarial robustness in AI, carries significant implications for IP practice, particularly concerning the patentability and defensive strategies around AI models. The discovery of an optimal range for $\max|\sigma''|$ (4 to 10) for adversarial robustness suggests a potentially patentable invention in the design and training of AI systems, offering a novel method for improving model security against adversarial attacks. From an IP perspective, this could lead to a surge in patent applications claiming specific activation function designs or training methodologies that leverage this curvature insight.

**Jurisdictional Comparison and Implications Analysis:**

* **United States:** The U.S. Patent and Trademark Office (USPTO) would likely evaluate claims related to the RCT-AF or its application under the framework of *Alice Corp. v. CLS Bank Int'l*, scrutinizing whether the invention is merely an abstract idea or a patent-eligible application. Claims focusing on the specific mathematical relationship and its tangible impact on AI model robustness (e.g., "a method for training a neural network comprising adjusting activation function parameters to maintain $\max|\sigma''|$ between 4 and 10 to enhance adversarial robustness") would likely fare better than abstract claims. The "technical solution to a technical problem" doctrine, while not explicitly codified, often influences examiners' perspectives on software-related inventions. This research provides a clear technical solution (improved robustness) to a technical problem (adversarial attacks
This article, while focused on machine learning, has significant implications for patent practitioners dealing with AI/ML inventions, particularly concerning enablement, written description, and infringement. The finding that optimal adversarial robustness is tied to a specific range of activation function curvature ($\max|\sigma''|$ between 4 and 10) could be crucial for drafting and prosecuting claims related to robust AI systems.

**Implications for Practitioners:**

1. **Enablement (35 U.S.C. § 112(a)):** For claims directed to AI models or methods designed for adversarial robustness, this research provides a concrete technical parameter that could be essential for satisfying enablement. If an inventor claims a robust AI system, merely stating "a robust neural network" might be insufficient if the claimed robustness critically depends on this specific curvature range. Practitioners should consider whether the specification adequately discloses how to achieve and/or measure this curvature, especially if the invention relies on the RCT-AF or similar curvature-tunable activation functions. Failure to disclose such details could lead to enablement rejections, as the public might not be able to make and use the invention without undue experimentation.
2. **Written Description (35 U.S.C. § 112(a)):** Similarly, for inventions where adversarial robustness is a key feature, the written description should ideally demonstrate possession of the invention by detailing how this optimal curvature range is achieved or utilized. If the invention's novelty or
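Claim language like "maintain $\max|\sigma''|$ between 4 and 10" presupposes that the curvature can actually be measured and tuned, which bears directly on the enablement questions above. The RCT-AF family is not specified in the abstract, so the sketch below uses a scaled tanh as a hypothetical stand-in for a curvature-tunable activation and estimates $\max|\sigma''|$ by finite differences; the stand-in activation and the search interval are assumptions, not from the paper:

```python
import numpy as np

def max_abs_second_derivative(sigma, lo=-5.0, hi=5.0, n=20001, h=1e-4):
    """Estimate max|sigma''| on [lo, hi] via central finite differences."""
    x = np.linspace(lo, hi, n)
    second = (sigma(x + h) - 2.0 * sigma(x) + sigma(x - h)) / h ** 2
    return np.max(np.abs(second))

# Hypothetical curvature-tunable activation: sigma_a(x) = tanh(a*x) / a.
# By the chain rule, sigma_a''(x) = a * tanh''(a*x), so the scale a tunes
# max|sigma''| linearly -- the exact quantity the paper's claims target.
def scaled_tanh(a):
    return lambda x: np.tanh(a * x) / a

for a in (1.0, 3.0, 8.0):
    k = max_abs_second_derivative(scaled_tanh(a))
    print(f"a={a:>4}: max|sigma''| ~ {k:.3f}")
```

For plain tanh the true value of $\max|\sigma''|$ is $4/(3\sqrt{3}) \approx 0.77$, so under this stand-in a scale of roughly 5 to 13 would land the curvature in the paper's reported 4-to-10 window; a specification could disclose exactly this kind of measure-then-tune procedure to support enablement.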
Can we generate portable representations for clinical time series data using LLMs?
arXiv:2603.23987v1 Announce Type: new Abstract: Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question -- can large language models (LLMs)...
Lagrangian Relaxation Score-based Generation for Mixed Integer Linear Programming
arXiv:2603.24033v1 Announce Type: new Abstract: Predict-and-search (PaS) methods have shown promise for accelerating mixed-integer linear programming (MILP) solving. However, existing approaches typically assume variable independence and rely on deterministic single-point predictions, which limits solution diversity and often necessitates extensive downstream search...
Describe-Then-Act: Proactive Agent Steering via Distilled Language-Action World Models
arXiv:2603.23149v1 Announce Type: new Abstract: Deploying safety-critical agents requires anticipating the consequences of actions before they are executed. While world models offer a paradigm for this proactive foresight, current approaches relying on visual simulation incur prohibitive latencies, often exceeding several...
Graph-Aware Late Chunking for Retrieval-Augmented Generation in Biomedical Literature
arXiv:2603.22633v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) systems for biomedical literature are typically evaluated using ranking metrics like Mean Reciprocal Rank (MRR), which measure how well the system identifies the single most relevant chunk. We argue that for full-text...
Synthetic or Authentic? Building Mental Patient Simulators from Longitudinal Evidence
arXiv:2603.22704v1 Announce Type: new Abstract: Patient simulation is essential for developing and evaluating mental health dialogue systems. As most existing approaches rely on snapshot-style prompts with limited profile information, homogeneous behaviors and incoherent disease progression in multi-turn interactions have become...
Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions
arXiv:2603.22973v1 Announce Type: new Abstract: Computational methods applied to legal scholarship hold the promise of analyzing law at scale. We start from a simple question: how often do courts implicitly apply statutory rules? This requires distinguishing legal reasoning from semantic...
Detecting Non-Membership in LLM Training Data via Rank Correlations
arXiv:2603.22707v1 Announce Type: new Abstract: As large language models (LLMs) are trained on increasingly vast and opaque text corpora, determining which data contributed to training has become essential for copyright enforcement, compliance auditing, and user trust. While prior work focuses...
On the use of Aggregation Operators to improve Human Identification using Dental Records
arXiv:2603.23003v1 Announce Type: new Abstract: The comparison of dental records is a standardized technique in forensic dentistry used to speed up the identification of individuals in multiple-comparison scenarios. Specifically, the odontogram comparison is a procedure to compute criteria that will...
MedCausalX: Adaptive Causal Reasoning with Self-Reflection for Trustworthy Medical Vision-Language Models
arXiv:2603.23085v1 Announce Type: new Abstract: Vision-Language Models (VLMs) have enabled interpretable medical diagnosis by integrating visual perception with linguistic reasoning. Yet, existing medical chain-of-thought (CoT) models lack explicit mechanisms to represent and enforce causal reasoning, leaving them vulnerable to spurious...
Whether, Not Which: Mechanistic Interpretability Reveals Dissociable Affect Reception and Emotion Categorization in LLMs
arXiv:2603.22295v1 Announce Type: new Abstract: Large language models appear to develop internal representations of emotion -- "emotion circuits," "emotion neurons," and structured emotional manifolds have been reported across multiple model families. But every study making these claims uses stimuli signalled...
Chain-of-Authorization: Internalizing Authorization into Large Language Models via Reasoning Trajectories
arXiv:2603.22869v1 Announce Type: new Abstract: Large Language Models (LLMs) have become core cognitive components in modern artificial intelligence (AI) systems, combining internal knowledge with external context to perform complex tasks. However, LLMs typically treat all accessible data indiscriminately, lacking inherent...
Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies
arXiv:2603.23406v1 Announce Type: new Abstract: While large language models simulate social behaviors, their capacity for stable stance formation and identity negotiation during complex interventions remains unclear. To overcome the limitations of static evaluations, this paper proposes a novel mixed-methods framework...
Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies
arXiv:2603.22651v1 Announce Type: new Abstract: The adoption of large language models (LLMs) for structured information extraction from financial documents has accelerated rapidly, yet production deployments face fundamental architectural decisions with limited empirical guidance. We present a systematic benchmark comparing four...
Intelligence Inertia: Physical Principles and Applications
arXiv:2603.22347v1 Announce Type: new Abstract: While Landauer's principle establishes the fundamental thermodynamic floor for information erasure and Fisher Information provides a metric for local curvature in parameter space, these classical frameworks function effectively only as approximations within regimes of sparse...
RelayS2S: A Dual-Path Speculative Generation for Real-Time Dialogue
arXiv:2603.23346v1 Announce Type: new Abstract: Real-time spoken dialogue systems face a fundamental tension between latency and response quality. End-to-end speech-to-speech (S2S) models respond immediately and naturally handle turn-taking, backchanneling, and interruption, but produce semantically weaker outputs. Cascaded pipelines (ASR ->...
PersonalQ: Select, Quantize, and Serve Personalized Diffusion Models for Efficient Inference
arXiv:2603.22943v1 Announce Type: new Abstract: Personalized text-to-image generation lets users fine-tune diffusion models into repositories of concept-specific checkpoints, but serving these repositories efficiently is difficult for two reasons: natural-language requests are often ambiguous and can be misrouted to visually similar...
SAiW: Source-Attributable Invisible Watermarking for Proactive Deepfake Defense
arXiv:2603.23178v1 Announce Type: new Abstract: Deepfakes generated by modern generative models pose a serious threat to information integrity, digital identity, and public trust. Existing detection methods are largely reactive, attempting to identify manipulations after they occur and often failing to...
PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments
arXiv:2603.23231v1 Announce Type: new Abstract: Empowering large language models with long-term memory is crucial for building agents that adapt to users' evolving needs. However, prior evaluations typically interleave preference-related dialogues with irrelevant conversations, reducing the task to needle-in-a-haystack retrieval while...
Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation
arXiv:2603.23047v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) fine-tuning has shown substantial improvements over vanilla RAG, yet most studies target document question answering and often rely on standard NLP metrics that can obscure factual differences. We evaluate RAG fine-tuning for...
Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy
arXiv:2603.23146v1 Announce Type: new Abstract: The widespread adoption of Large Language Models (LLMs) has made the detection of AI-Generated text a pressing and complex challenge. Although many detection systems report high benchmark accuracy, their reliability in real-world settings remains uncertain,...
UniDial-EvalKit: A Unified Toolkit for Evaluating Multi-Faceted Conversational Abilities
arXiv:2603.23160v1 Announce Type: new Abstract: Benchmarking AI systems in multi-turn interactive scenarios is essential for understanding their practical capabilities in real-world applications. However, existing evaluation protocols are highly heterogeneous, differing significantly in dataset formats, model interfaces, and evaluation pipelines, which...
Latent Semantic Manifolds in Large Language Models
arXiv:2603.22301v1 Announce Type: new Abstract: Large Language Models (LLMs) perform internal computations in continuous vector spaces yet produce discrete tokens -- a fundamental mismatch whose geometric consequences remain poorly understood. We develop a mathematical framework that interprets LLM hidden states...
A Multi-Modal CNN-LSTM Framework with Multi-Head Attention and Focal Loss for Real-Time Elderly Fall Detection
arXiv:2603.22313v1 Announce Type: new Abstract: The increasing global aging population has intensified the demand for reliable health monitoring systems, particularly those capable of detecting critical events such as falls among elderly individuals. Traditional fall detection approaches relying on single-modality acceleration...
Trained Persistent Memory for Frozen Decoder-Only LLMs
arXiv:2603.22329v1 Announce Type: new Abstract: Decoder-only language models are stateless: hidden representations are discarded after every forward pass and nothing persists across sessions. Jeong (2026a) showed that trained memory adapters give a frozen encoder-decoder backbone persistent latent-space memory, building on...
Conformal Risk Control for Safety-Critical Wildfire Evacuation Mapping: A Comparative Study of Tabular, Spatial, and Graph-Based Models
arXiv:2603.22331v1 Announce Type: new Abstract: Every wildfire prediction model deployed today shares a dangerous property: none of these methods provides formal guarantees on how much fire spread is missed. Despite extensive work on wildfire spread prediction using deep learning, no...