Litigation

LOW Academic International

L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv:2603.19236v1 Announce Type: cross Abstract: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework provides a rigorous foundation for evidence synthesis, yet the manual processes of data extraction and literature screening remain time-consuming and restrictive. Recent advances in...

1 min 3 weeks, 5 days ago

evidence

LOW Academic International

Framing Effects in Independent-Agent Large Language Models: A Cross-Family Behavioral Analysis

arXiv:2603.19282v1 Announce Type: cross Abstract: In many real-world applications, large language models (LLMs) operate as independent agents without interaction, thereby limiting coordination. In this setting, we examine how prompt framing influences decisions in a threshold voting task involving individual-group interest...

1 min 3 weeks, 5 days ago

trial

LOW Academic International

A Visualization for Comparative Analysis of Regression Models

arXiv:2603.19291v1 Announce Type: cross Abstract: As regression is a widely studied problem, many methods have been proposed to solve it, each of them often requiring setting different hyper-parameters. Therefore, selecting the proper method for a given application may be very...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

Spelling Correction in Healthcare Query-Answer Systems: Methods, Retrieval Impact, and Empirical Evaluation

arXiv:2603.19249v1 Announce Type: new Abstract: Healthcare question-answering (QA) systems face a persistent challenge: users submit queries with spelling errors at rates substantially higher than those found in the professional documents they search. This paper presents the first controlled study of...

1 min 3 weeks, 5 days ago

evidence

LOW Academic International

From Comprehension to Reasoning: A Hierarchical Benchmark for Automated Financial Research Reporting

arXiv:2603.19254v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to generate financial research reports, shifting from auxiliary analytic tools to primary content producers. Yet recent real-world deployments reveal persistent failures--factual errors, numerical inconsistencies, fabricated references, and shallow...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

From Tokens To Agents: A Researcher's Guide To Understanding Large Language Models

arXiv:2603.19269v1 Announce Type: new Abstract: Researchers face a critical choice: how to use -- or not use -- large language models in their work. Using them well requires understanding the mechanisms that shape what LLMs can and cannot do. This...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

Automated Motif Indexing on the Arabian Nights

arXiv:2603.19283v1 Announce Type: new Abstract: Motifs are non-commonplace, recurring narrative elements, often found originally in folk stories. In addition to being of interest to folklorists, motifs appear as metaphoric devices in modern news, literature, propaganda, and other cultural texts. Finding...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

Memory-Driven Role-Playing: Evaluation and Enhancement of Persona Knowledge Utilization in LLMs

arXiv:2603.19313v1 Announce Type: new Abstract: A core challenge for faithful LLM role-playing is sustaining consistent characterization throughout long, open-ended dialogues, as models frequently fail to recall and accurately apply their designated persona knowledge without explicit cues. To tackle this, we...

1 min 3 weeks, 5 days ago

motion

LOW Academic International

Prompt-tuning with Attribute Guidance for Low-resource Entity Matching

arXiv:2603.19321v1 Announce Type: new Abstract: Entity Matching (EM) is an important task that determines the logical relationship between two entities, such as Same, Different, or Undecidable. Traditional EM approaches rely heavily on supervised learning, which requires large amounts of high-quality...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

Is Evaluation Awareness Just Format Sensitivity? Limitations of Probe-Based Evidence under Controlled Prompt Structure

arXiv:2603.19426v1 Announce Type: new Abstract: Prior work uses linear probes on benchmark prompts as evidence of evaluation awareness in large language models. Because evaluation context is typically entangled with benchmark format and genre, it is unclear whether probe-based signals reflect...

1 min 3 weeks, 5 days ago

evidence

LOW Academic International

EvidenceRL: Reinforcing Evidence Consistency for Trustworthy Language Models

arXiv:2603.19532v1 Announce Type: new Abstract: Large Language Models (LLMs) are fluent but prone to hallucinations, producing answers that appear plausible yet are unsupported by available evidence. This failure is especially problematic in high-stakes domains where decisions must be justified by...

1 min 3 weeks, 5 days ago

evidence

LOW Academic International

BEAVER: A Training-Free Hierarchical Prompt Compression Method via Structure-Aware Page Selection

arXiv:2603.19635v1 Announce Type: new Abstract: The exponential expansion of context windows in LLMs has unlocked capabilities for long-document understanding but introduced severe bottlenecks in inference latency and information utilization. Existing compression methods often suffer from high training costs or semantic...

1 min 3 weeks, 5 days ago

standing

LOW Academic International

Prune-then-Quantize or Quantize-then-Prune? Understanding the Impact of Compression Order in Joint Model Compression

arXiv:2603.18426v1 Announce Type: new Abstract: What happens when multiple compression methods are combined--does the order in which they are applied matter? Joint model compression has emerged as a powerful strategy to achieve higher efficiency by combining multiple methods such as...

1 min 4 weeks, 1 day ago

standing

LOW Academic International

TherapyGym: Evaluating and Aligning Clinical Fidelity and Safety in Therapy Chatbots

arXiv:2603.18008v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for mental-health support; yet prevailing evaluation methods--fluency metrics, preference tests, and generic dialogue benchmarks--fail to capture the clinically critical dimensions of psychotherapy. We introduce THERAPYGYM, a framework that...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

Large-Scale Analysis of Political Propaganda on Moltbook

arXiv:2603.18349v1 Announce Type: new Abstract: We present an NLP-based study of political propaganda on Moltbook, a Reddit-style platform for AI agents. To enable large-scale analysis, we develop LLM-based classifiers to detect political propaganda, validated against expert annotation (Cohen's $\kappa$ = 0.64-0.74)....

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

How Confident Is the First Token? An Uncertainty-Calibrated Prompt Optimization Framework for Large Language Model Classification and Understanding

arXiv:2603.18009v1 Announce Type: new Abstract: With the widespread adoption of large language models (LLMs) in natural language processing, prompt engineering and retrieval-augmented generation (RAG) have become mainstream to enhance LLMs' performance on complex tasks. However, LLMs generate outputs autoregressively, leading...

1 min 4 weeks, 1 day ago

standing

LOW Academic International

Multi-Trait Subspace Steering to Reveal the Dark Side of Human-AI Interaction

arXiv:2603.18085v1 Announce Type: new Abstract: Recent incidents have highlighted alarming cases where human-AI interactions led to negative psychological outcomes, including mental health crises and even user harm. As LLMs serve as sources of guidance, emotional support, and even informal therapy,...

1 min 4 weeks, 1 day ago

motion

LOW Academic International

BenchBrowser -- Collecting Evidence for Evaluating Benchmark Validity

arXiv:2603.18019v1 Announce Type: new Abstract: Do language model benchmarks actually measure what practitioners intend them to? High-level metadata is too coarse to convey the granular reality of benchmarks: a "poetry" benchmark may never test for haikus, while "instruction-following" benchmarks...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

arXiv:2603.18577v1 Announce Type: new Abstract: Text-guided image editors can now manipulate authentic medical scans with high fidelity, enabling lesion implantation/removal that threatens clinical trust and safety. Existing defenses are inadequate for healthcare. Medical detectors are largely black-box, while MLLM-based explainers...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

Controllable Evidence Selection in Retrieval-Augmented Question Answering via Deterministic Utility Gating

arXiv:2603.18011v1 Announce Type: new Abstract: Many modern AI question-answering systems convert text into vectors and retrieve the closest matches to a user question. While effective for topical similarity, similarity scores alone do not explain why some retrieved text can serve...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

D-Mem: A Dual-Process Memory System for LLM Agents

arXiv:2603.18631v1 Announce Type: new Abstract: Driven by the development of persistent, self-adapting autonomous agents, equipping these systems with high-fidelity memory access for long-horizon reasoning has emerged as a critical requirement. However, prevalent retrieval-based memory frameworks often follow an incremental processing...

1 min 4 weeks, 1 day ago

standing

LOW Academic International

From Topic to Transition Structure: Unsupervised Concept Discovery at Corpus Scale via Predictive Associative Memory

arXiv:2603.18420v1 Announce Type: new Abstract: Embedding models group text by semantic content, what text is about. We show that temporal co-occurrence within texts discovers a different kind of structure: recurrent transition-structure concepts or what text does. We train a 29.4M-parameter...

1 min 4 weeks, 1 day ago

discovery

LOW Academic International

Learned but Not Expressed: Capability-Expression Dissociation in Large Language Models

arXiv:2603.18013v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate the capacity to reconstruct and trace learned content from their training data under specific elicitation conditions, yet this capability does not manifest in standard generation contexts. This empirical observational study...

1 min 4 weeks, 1 day ago

standing

LOW Academic International

UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference

arXiv:2603.18446v1 Announce Type: new Abstract: Long-context inference remains challenging for large language models due to attention dilution and out-of-distribution degradation. Context selection mitigates this limitation by attending to a subset of key-value cache entries, yet most methods allocate a fixed...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units

arXiv:2603.18612v1 Announce Type: new Abstract: We introduce DiscoPhon, a multilingual benchmark for evaluating unsupervised phoneme discovery from discrete speech units. DiscoPhon covers 6 dev and 6 test languages, chosen to span a wide range of phonemic contrasts. Given only 10...

1 min 4 weeks, 1 day ago

discovery

LOW Academic International

Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo

arXiv:2603.18873v1 Announce Type: new Abstract: Popular language learning applications such as Duolingo use large language models (LLMs) to generate lessons for their users. Most lessons focus on general real-world scenarios such as greetings, ordering food, or asking directions, with limited...

1 min 4 weeks, 1 day ago

standing

LOW Academic International

Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

arXiv:2603.19008v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) improves Large Language Models (LLMs) by grounding generation in external, non-parametric knowledge. However, when a task requires choosing among competing options, simply grounding generation in broadly relevant context is often insufficient to...

1 min 4 weeks, 1 day ago

evidence

LOW Academic International

Towards Differentiating Between Failures and Domain Shifts in Industrial Data Streams

arXiv:2603.18032v1 Announce Type: new Abstract: Anomaly and failure detection methods are crucial in identifying deviations from normal system operational conditions, which allows for actions to be taken in advance, usually preventing more serious damages. Long-lasting deviations indicate failures, while sudden,...

1 min 4 weeks, 1 day ago

trial

LOW Academic International

Quotient Geometry and Persistence-Stable Metrics for Swarm Configurations

arXiv:2603.18041v1 Announce Type: new Abstract: Swarm and constellation reconfiguration can be viewed as motion of an unordered point configuration in an ambient space. Here, we provide persistence-stable, symmetry-invariant geometric representations for comparing and monitoring multi-agent configuration data. We introduce a...

1 min 4 weeks, 1 day ago

motion

LOW Academic International

AGRI-Fidelity: Evaluating the Reliability of Listenable Explanations for Poultry Disease Detection

arXiv:2603.18247v1 Announce Type: new Abstract: Existing XAI metrics measure faithfulness for a single model, ignoring model multiplicity where near-optimal classifiers rely on different or spurious acoustic cues. In noisy farm environments, stationary artifacts such as ventilation noise can produce explanations...

1 min 4 weeks, 1 day ago

discovery
