RadAnnotate: Large Language Models for Efficient and Reliable Radiology Report Annotation
arXiv:2603.16002v1 Announce Type: new Abstract: Radiology report annotation is essential for clinical NLP, yet manual labeling is slow and costly. We present RadAnnotate, an LLM-based framework that studies retrieval-augmented synthetic reports and confidence-based selective automation to reduce expert effort for...
Understanding Moral Reasoning Trajectories in Large Language Models: Toward Probing-Based Explainability
arXiv:2603.16017v1 Announce Type: new Abstract: Large language models (LLMs) increasingly participate in morally sensitive decision-making, yet how they organize ethical frameworks across reasoning steps remains underexplored. We introduce "moral reasoning trajectories", sequences of ethical framework invocations across intermediate reasoning steps,...
SEAHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Southeast Asia
arXiv:2603.16070v1 Announce Type: new Abstract: Hate speech detection relies heavily on linguistic resources, which are primarily available in high-resource languages such as English and Chinese, creating barriers for researchers and platforms developing tools for low-resource languages in Southeast Asia, where...
ClaimFlow: Tracing the Evolution of Scientific Claims in NLP
arXiv:2603.16073v1 Announce Type: new Abstract: Scientific papers do more than report results: they advance claims that later work supports, extends, or sometimes refutes. Yet existing methods for citation and claim analysis capture only fragments of this dialogue. In this...
CounterRefine: Answer-Conditioned Counterevidence Retrieval for Inference-Time Knowledge Repair in Factual Question Answering
arXiv:2603.16091v1 Announce Type: new Abstract: In factual question answering, many errors are not failures of access but failures of commitment: the system retrieves relevant evidence, yet still settles on the wrong answer. We present CounterRefine, a lightweight inference-time repair layer...
ASDA: Automated Skill Distillation and Adaptation for Financial Reasoning
arXiv:2603.16112v1 Announce Type: new Abstract: Adapting large language models (LLMs) to specialized financial reasoning typically requires expensive fine-tuning that produces model-locked expertise. Training-free alternatives have emerged, yet our experiments show that leading methods (GEPA and ACE) achieve only marginal gains...
Language Models Don't Know What You Want: Evaluating Personalization in Deep Research Needs Real Users
arXiv:2603.16120v1 Announce Type: new Abstract: Deep Research (DR) tools (e.g. OpenAI DR) help researchers cope with ballooning publication volumes. Such tools can synthesize scientific papers to answer researchers' queries, but lack understanding of their users. We change that in MyScholarQA...
SIA: A Synthesize-Inject-Align Framework for Knowledge-Grounded and Secure E-commerce Search LLMs with Industrial Deployment
arXiv:2603.16137v1 Announce Type: new Abstract: Large language models offer transformative potential for e-commerce search by enabling intent-aware recommendations. However, their industrial deployment is hindered by two critical challenges: (1) knowledge hallucination due to insufficient encoding of dynamic, fine-grained product knowledge,...
Parametric Social Identity Injection and Diversification in Public Opinion Simulation
arXiv:2603.16142v1 Announce Type: new Abstract: Large language models (LLMs) have recently been adopted as synthetic agents for public opinion simulation, offering a promising alternative to costly and slow human surveys. Despite their scalability, current LLM-based simulation methods fail to capture...
Polyglot-Lion: Efficient Multilingual ASR for Singapore via Balanced Fine-Tuning of Qwen3-ASR
arXiv:2603.16184v1 Announce Type: new Abstract: We present Polyglot-Lion, a family of compact multilingual automatic speech recognition (ASR) models tailored for the linguistic landscape of Singapore, covering English, Mandarin, Tamil, and Malay. Our models are obtained by fine-tuning Qwen3-ASR-0.6B and Qwen3-ASR-1.7B...
Structured Semantic Cloaking for Jailbreak Attacks on Large Language Models
arXiv:2603.16192v1 Announce Type: new Abstract: Modern LLMs employ safety mechanisms that extend beyond surface-level input filtering to latent semantic representations and generation-time reasoning, enabling them to recover obfuscated malicious intent during inference and refuse accordingly, and rendering many surface-level obfuscation...
Is Semi-Automatic Transcription Useful in Corpus Creation? Preliminary Considerations on the KIParla Corpus
arXiv:2603.16258v1 Announce Type: new Abstract: This paper analyses the implementation of Automatic Speech Recognition (ASR) into the transcription workflow of the KIParla corpus, a resource of spoken Italian. Through a two-phase experiment, 11 expert and novice transcribers produced both manual...
PyPhonPlan: Simulating phonetic planning with dynamic neural fields and task dynamics
arXiv:2603.16299v1 Announce Type: new Abstract: We introduce PyPhonPlan, a Python toolkit for implementing dynamical models of phonetic planning using coupled dynamic neural fields and task dynamic simulations. The toolkit provides modular components for defining planning, perception and memory fields, as...
PashtoCorp: A 1.25-Billion-Word Corpus, Evaluation Suite, and Reproducible Pipeline for Low-Resource Language Development
arXiv:2603.16354v1 Announce Type: new Abstract: We present PashtoCorp, a 1.25-billion-word corpus for Pashto, a language spoken by 60 million people that remains severely underrepresented in NLP. The corpus is assembled from 39 sources spanning seven HuggingFace datasets and 32 purpose-built...
RECOVER: Robust Entity Correction via agentic Orchestration of hypothesis Variants for Evidence-based Recovery
arXiv:2603.16411v1 Announce Type: new Abstract: Entity recognition in Automatic Speech Recognition (ASR) is challenging for rare and domain-specific terms. In domains such as finance, medicine, and air traffic control, these errors are costly. If the entities are entirely absent from...
IndexRAG: Bridging Facts for Cross-Document Reasoning at Index Time
arXiv:2603.16415v1 Announce Type: new Abstract: Multi-hop question answering (QA) requires reasoning across multiple documents, yet existing retrieval-augmented generation (RAG) approaches address this either through graph-based methods requiring additional online processing or iterative multi-step reasoning. We present IndexRAG, a novel approach...
DynHD: Hallucination Detection for Diffusion Large Language Models via Denoising Dynamics Deviation Learning
arXiv:2603.16459v1 Announce Type: new Abstract: Diffusion large language models (D-LLMs) have emerged as a promising alternative to auto-regressive models due to their iterative refinement capabilities. However, hallucinations remain a critical issue that hinders their reliability. To detect hallucination responses from...
On the Emotion Understanding of Synthesized Speech
arXiv:2603.16483v1 Announce Type: new Abstract: Emotion is a core paralinguistic feature in voice interaction. It is widely believed that emotion understanding models learn fundamental representations that transfer to synthesized speech, making emotion understanding results a plausible reward or evaluation metric...
AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents
arXiv:2603.16496v1 Announce Type: new Abstract: Large language model (LLM) agents increasingly rely on external memory to support long-horizon interaction, personalized assistance, and multi-step reasoning. However, existing memory systems still face three core challenges: they often rely too heavily on semantic...
How often do Answers Change? Estimating Recency Requirements in Question Answering
arXiv:2603.16544v1 Announce Type: new Abstract: Large language models (LLMs) often rely on outdated knowledge when answering time-sensitive questions, leading to confident yet incorrect responses. Without explicit signals indicating whether up-to-date information is required, models struggle to decide when to retrieve...
How to Achieve Prototypical Birth and Death for OOD Detection?
arXiv:2603.15650v1 Announce Type: new Abstract: Out-of-Distribution (OOD) detection is crucial for the secure deployment of machine learning models, and prototype-based learning methods are among the mainstream strategies for achieving OOD detection. Existing prototype-based learning methods generally rely on a fixed...
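The abstract does not describe RadAnnotate-style specifics of this paper's prototype scheme, but the general prototype-based OOD recipe it builds on is standard: score a sample by its distance to the nearest class prototype and flag distant samples as out-of-distribution. A minimal sketch of that generic scoring rule (not this paper's method; `ood_score` and the 2-D prototypes are illustrative assumptions):

```python
import numpy as np

def ood_score(x, prototypes):
    """Distance-based OOD score: distance from sample x to its nearest
    class prototype. Larger scores suggest the sample lies farther from
    all known classes (a generic rule, not this paper's contribution)."""
    dists = np.linalg.norm(prototypes - x, axis=1)
    return dists.min()

# Two fixed class prototypes in a toy 2-D feature space.
protos = np.array([[0.0, 0.0], [5.0, 5.0]])

in_dist = np.array([0.1, -0.1])     # near a prototype -> low score
out_dist = np.array([20.0, -20.0])  # far from both -> high score

print(ood_score(in_dist, protos) < ood_score(out_dist, protos))  # True
```

The paper's point of departure is that such methods "rely on a fixed" prototype set; dynamic creation and removal of prototypes ("birth and death") would replace the static `protos` array above.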
Discovering the Hidden Role of Gini Index In Prompt-based Classification
arXiv:2603.15654v1 Announce Type: new Abstract: In classification tasks, the predictions that matter most often concern long-tailed minority classes. Yet these classes consistently exhibit low accuracy, while a few high-performing classes dominate. We pursue a foundational understanding...
Transition Flow Matching
arXiv:2603.15689v1 Announce Type: new Abstract: Mainstream flow matching methods typically focus on learning the local velocity field, which inherently requires multiple integration steps during generation. In contrast, Mean Velocity Flow models establish a relationship between the local velocity field and...
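The "local velocity field" that the abstract contrasts against mean-velocity models can be made concrete with the standard conditional flow matching construction (a generic background sketch, not this paper's Transition Flow Matching): interpolate x_t = (1 - t)·x0 + t·x1 between a noise sample x0 and a data sample x1, and regress a network onto the constant conditional velocity dx_t/dt = x1 - x0.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear interpolation path between a noise sample x0 and a data
# sample x1: x_t = (1 - t) * x0 + t * x1.  Along this path the local
# (conditional) velocity is constant: dx_t/dt = x1 - x0.
x0 = rng.standard_normal(3)  # source (noise) sample
x1 = rng.standard_normal(3)  # target (data) sample
t = 0.3

x_t = (1 - t) * x0 + t * x1
target_velocity = x1 - x0

# A model v_theta(x_t, t) trained on this target is queried at many
# small steps to integrate dx/dt = v_theta during generation, which is
# exactly the multi-step cost that mean-velocity models try to avoid.
print(np.allclose(x_t + (1 - t) * target_velocity, x1))  # True
```

The final check verifies that following the local velocity for the remaining time 1 - t lands exactly on the data sample x1.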
Tackling Over-smoothing on Hypergraphs: A Ricci Flow-guided Neural Diffusion Approach
arXiv:2603.15696v1 Announce Type: new Abstract: Hypergraph neural networks (HGNNs) have demonstrated strong capabilities in modeling complex higher-order relationships. However, existing HGNNs often suffer from over-smoothing as the number of layers increases and lack effective control over message passing among nodes....
Mastering the Minority: An Uncertainty-guided Multi-Expert Framework for Challenging-tailed Sequence Learning
arXiv:2603.15708v1 Announce Type: new Abstract: Imbalanced data distribution remains a critical challenge in sequential learning, leading models to easily recognize frequent categories while failing to detect minority classes adequately. The Mixture-of-Experts model offers a scalable solution, yet its application is...
Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences
arXiv:2603.15713v1 Announce Type: new Abstract: Industrial financial systems operate on temporal event sequences such as transactions, user actions, and system logs. While recent research emphasizes representation learning and large language models, production systems continue to rely heavily on handcrafted statistical...
Meta-TTRL: A Metacognitive Framework for Self-Improving Test-Time Reinforcement Learning in Unified Multimodal Models
arXiv:2603.15724v1 Announce Type: new Abstract: Existing test-time scaling (TTS) methods for unified multimodal models (UMMs) in text-to-image (T2I) generation primarily rely on search or sampling strategies that produce only instance-level improvements, limiting the ability to learn from prior inferences and...
Mask Is What DLLM Needs: A Masked Data Training Paradigm for Diffusion LLMs
arXiv:2603.15803v1 Announce Type: new Abstract: Discrete diffusion models offer global context awareness and flexible parallel generation. However, uniform random noise schedulers in standard DLLM training overlook the highly non-uniform information density inherent in real-world sequences. This wastes optimization resources on...
Hypothesis Class Determines Explanation: Why Accurate Models Disagree on Feature Attribution
arXiv:2603.15821v1 Announce Type: new Abstract: The assumption that prediction-equivalent models produce equivalent explanations underlies many practices in explainable AI, including model selection, auditing, and regulatory evaluation. In this work, we show that this assumption does not hold. Through a large-scale...
FlashSampling: Fast and Memory-Efficient Exact Sampling
arXiv:2603.15854v1 Announce Type: new Abstract: Sampling from a categorical distribution is mathematically simple, but in large-vocabulary decoding, it often triggers extra memory traffic and extra kernels after the LM head. We present FlashSampling, an exact sampling primitive that fuses sampling...
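The abstract does not reveal which primitive FlashSampling fuses, but one well-known route to exact categorical sampling without a separate normalization pass is the Gumbel-max trick: argmax(logits + Gumbel noise) is distributed exactly as softmax(logits). A generic illustration of that identity (not this paper's kernel):

```python
import numpy as np

def gumbel_max_sample(logits, rng):
    """Exact categorical sampling via the Gumbel-max trick:
    argmax(logits + Gumbel noise) ~ Categorical(softmax(logits)),
    with no explicit softmax or cumulative-sum pass."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + g))

rng = np.random.default_rng(0)
logits = np.array([0.0, 2.0, -1.0])

# Empirical sampling frequencies should match softmax(logits).
counts = np.bincount(
    [gumbel_max_sample(logits, rng) for _ in range(20000)], minlength=3)
probs = np.exp(logits) / np.exp(logits).sum()
print(np.allclose(counts / 20000, probs, atol=0.02))  # True
```

In large-vocabulary decoding the interesting part is doing this without materializing extra intermediate tensors after the LM head, which is the memory-traffic problem the paper targets.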