International Law

LOW Academic International

On-Policy Supervised Fine-Tuning for Efficient Reasoning

arXiv:2602.13407v1 Announce Type: new Abstract: Large reasoning models (LRMs) are commonly trained with reinforcement learning (RL) to explore long chain-of-thought reasoning, achieving strong performance at high computational cost. Recent methods add multi-reward objectives to jointly optimize correctness and brevity, but...

1 min 1 month, 2 weeks ago

LOW Academic International

OpAgent: Operator Agent for Web Navigation

arXiv:2602.13559v1 Announce Type: new Abstract: To fulfill user instructions, autonomous web agents must contend with the inherent complexity and volatile nature of real-world websites. Conventional paradigms predominantly rely on Supervised Fine-Tuning (SFT) or Offline Reinforcement Learning (RL) using static datasets....

1 min 1 month, 2 weeks ago

LOW Academic International

Hippocampus: An Efficient and Scalable Memory Module for Agentic AI

arXiv:2602.13594v1 Announce Type: new Abstract: Agentic AI requires persistent memory to store user-specific histories beyond the limited context window of LLMs. Existing memory systems use dense vector databases or knowledge-graph traversal (or hybrid), incurring high retrieval latency and poor storage...

1 min 1 month, 2 weeks ago

LOW Academic International

Guided Collaboration in Heterogeneous LLM-Based Multi-Agent Systems via Entropy-Based Understanding Assessment and Experience Retrieval

arXiv:2602.13639v1 Announce Type: new Abstract: With recent breakthroughs in large language models (LLMs) for reasoning, planning, and complex task generation, artificial intelligence systems are transitioning from isolated single-agent architectures to multi-agent systems with collaborative intelligence. However, in heterogeneous multi-agent systems...

1 min 1 month, 2 weeks ago

LOW Academic International

Building Autonomous GUI Navigation via Agentic-Q Estimation and Step-Wise Policy Optimization

arXiv:2602.13653v1 Announce Type: new Abstract: Recent advances in Multimodal Large Language Models (MLLMs) have substantially driven the progress of autonomous agents for Graphical User Interface (GUI). Nevertheless, in real-world applications, GUI agents are often faced with non-stationary environments, leading to...

1 min 1 month, 2 weeks ago

LOW Academic International

AllMem: A Memory-centric Recipe for Efficient Long-context Modeling

arXiv:2602.13680v1 Announce Type: new Abstract: Large Language Models (LLMs) encounter significant performance bottlenecks in long-sequence tasks due to the computational complexity and memory overhead inherent in the self-attention mechanism. To address these challenges, we introduce AllMem, a novel and efficient...

1 min 1 month, 2 weeks ago

LOW Academic International

Using Machine Learning to Enhance the Detection of Obfuscated Abusive Words in Swahili: A Focus on Child Safety

arXiv:2602.13455v1 Announce Type: new Abstract: The rise of digital technology has dramatically increased the potential for cyberbullying and online abuse, necessitating enhanced measures for detection and prevention, especially among children. This study focuses on detecting abusive obfuscated language in Swahili,...

1 min 1 month, 2 weeks ago

LOW Academic International

Language Model Memory and Memory Models for Language

arXiv:2602.13466v1 Announce Type: new Abstract: The ability of machine learning models to store input information in hidden layer vector embeddings, analogous to the concept of 'memory', is widely employed but not well characterized. We find that language model embeddings typically...

1 min 1 month, 2 weeks ago

LOW Academic International

From Perceptions To Evidence: Detecting AI-Generated Content In Turkish News Media With A Fine-Tuned Bert Classifier

arXiv:2602.13504v1 Announce Type: new Abstract: The rapid integration of large language models into newsroom workflows has raised urgent questions about the prevalence of AI-generated content in online media. While computational studies have begun to quantify this phenomenon in English-language outlets,...

1 min 1 month, 2 weeks ago

LOW Academic International

Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens

arXiv:2602.13517v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated impressive reasoning capabilities by scaling test-time compute via long Chain-of-Thought (CoT). However, recent findings suggest that raw token counts are unreliable proxies for reasoning quality: increased generation length does...

1 min 1 month, 2 weeks ago

LOW Academic International

Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment

arXiv:2602.13575v1 Announce Type: new Abstract: Current alignment methods for Large Language Models (LLMs) rely on compressing vast amounts of human preference data into static, absolute reward functions, leading to data scarcity, noise sensitivity, and training instability. We introduce Elo-Evolve, a...

1 min 1 month, 2 weeks ago

LOW Academic International

Metaphors' journeys across time and genre: tracking the evolution of literary metaphors with temporal embeddings

arXiv:2602.13701v1 Announce Type: new Abstract: Metaphors are a distinctive feature of literary language, yet they remain less studied experimentally than everyday metaphors. Moreover, previous psycholinguistic and computational approaches overlooked the temporal dimension, although many literary metaphors were coined centuries apart...

1 min 1 month, 2 weeks ago

LOW Academic International

On Theoretically-Driven LLM Agents for Multi-Dimensional Discourse Analysis

arXiv:2602.13713v1 Announce Type: new Abstract: Identifying the strategic uses of reformulation in discourse remains a key challenge for computational argumentation. While LLMs can detect surface-level similarity, they often fail to capture the pragmatic functions of rephrasing, such as its role...

1 min 1 month, 2 weeks ago

LOW Academic International

RMPL: Relation-aware Multi-task Progressive Learning with Stage-wise Training for Multimedia Event Extraction

arXiv:2602.13748v1 Announce Type: new Abstract: Multimedia Event Extraction (MEE) aims to identify events and their arguments from documents that contain both text and images. It requires grounding event semantics across different modalities. Progress in MEE is limited by the lack...

1 min 1 month, 2 weeks ago

LOW Academic International

Beyond Words: Evaluating and Bridging Epistemic Divergence in User-Agent Interaction via Theory of Mind

arXiv:2602.13832v1 Announce Type: new Abstract: Large Language Models (LLMs) have developed rapidly and are widely applied to both general-purpose and professional tasks to assist human users. However, they still struggle to comprehend and respond to the true user needs when...

1 min 1 month, 2 weeks ago

LOW Academic International

PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training

arXiv:2602.13840v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed in personalized tasks involving sensitive, context-dependent information, where privacy violations may arise in agents' actions due to the implicitness of contextual privacy. Existing approaches rely on external,...

1 min 1 month, 2 weeks ago

LOW Academic International

Tutoring Large Language Models to be Domain-adaptive, Precise, and Safe

arXiv:2602.13860v1 Announce Type: new Abstract: The overarching research direction of this work is the development of a "Responsible Intelligence" framework designed to reconcile the immense generative power of Large Language Models (LLMs) with the stringent requirements of real-world deployment. As...

1 min 1 month, 2 weeks ago

LOW Academic International

Bridging the Multilingual Safety Divide: Efficient, Culturally-Aware Alignment for Global South Languages

arXiv:2602.13867v1 Announce Type: new Abstract: Large language models (LLMs) are being deployed across the Global South, where everyday use involves low-resource languages, code-mixing, and culturally specific norms. Yet safety pipelines, benchmarks, and alignment still largely target English and a handful...

1 min 1 month, 2 weeks ago

LOW Academic International

Evaluating Prompt Engineering Techniques for RAG in Small Language Models: A Multi-Hop QA Approach

arXiv:2602.13890v1 Announce Type: new Abstract: Retrieval Augmented Generation (RAG) is a powerful approach for enhancing the factual grounding of language models by integrating external knowledge. While widely studied for large language models, the optimization of RAG for Small Language Models...

1 min 1 month, 2 weeks ago

LOW Academic International

Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness

arXiv:2602.14044v1 Announce Type: new Abstract: Large language models (LLMs) show strong reasoning abilities across diverse tasks, yet their performance on extended contexts remains inconsistent. While prior research has emphasized mid-context degradation in question answering, this study examines the impact of...

1 min 1 month, 2 weeks ago

LOW Academic International

LogitsCoder: Towards Efficient Chain-of-Thought Path Search via Logits Preference Decoding for Code Generation

arXiv:2602.14054v1 Announce Type: new Abstract: Code generation remains a challenging task that requires precise and structured reasoning. Existing Test Time Scaling (TTS) methods, including structured tree search, have made progress in exploring reasoning paths but still face two major challenges:...

1 min 1 month, 2 weeks ago

LOW Academic International

ResearchGym: Evaluating Language Model Agents on Real-World AI Research

arXiv:2602.15112v1 Announce Type: new Abstract: We introduce ResearchGym, a benchmark and execution environment for evaluating AI agents on end-to-end research. To instantiate this, we repurpose five oral and spotlight papers from ICML, ICLR, and ACL. From each paper's repository, we...

1 min 1 month, 2 weeks ago

LOW Academic International

Panini: Continual Learning in Token Space via Structured Memory

arXiv:2602.15156v1 Announce Type: new Abstract: Language models are increasingly used to reason over content they were not trained on, such as new documents, evolving knowledge, and user-specific data. A common approach is retrieval-augmented generation (RAG), which stores verbatim documents externally...

1 min 1 month, 2 weeks ago

LOW Academic International

da Costa and Tarski meet Goguen and Carnap: a novel approach for ontological heterogeneity based on consequence systems

arXiv:2602.15158v1 Announce Type: new Abstract: This paper presents a novel approach for ontological heterogeneity that draws heavily from Carnapian-Goguenism, as presented by Kutz, Mossakowski and Lücke (2010). The approach is provisionally designated da Costian-Tarskianism, named after da Costa's Principle of...

1 min 1 month, 2 weeks ago

LOW Academic International

Predicting Invoice Dilution in Supply Chain Finance with Leakage Free Two Stage XGBoost, KAN (Kolmogorov Arnold Networks), and Ensemble Models

arXiv:2602.15248v1 Announce Type: new Abstract: Invoice or payment dilution, the gap between the approved invoice amount and the actual collection, is a significant source of non-credit risk and margin loss in supply chain finance. Traditionally, this risk is...

1 min 1 month, 2 weeks ago

LOW Academic International

Epistemic Traps: Rational Misalignment Driven by Model Misspecification

arXiv:2602.17676v1 Announce Type: new Abstract: The rapid deployment of Large Language Models and AI agents across critical societal and technical domains is hindered by persistent behavioral pathologies including sycophancy, hallucination, and strategic deception that resist mitigation via reinforcement learning. Current...

1 min 1 month, 2 weeks ago

LOW Academic International

Cross-Embodiment Offline Reinforcement Learning for Heterogeneous Robot Datasets

arXiv:2602.18025v1 Announce Type: new Abstract: Scalable robot policy pre-training has been hindered by the high cost of collecting high-quality demonstrations for each platform. In this study, we address this issue by uniting offline reinforcement learning (offline RL) with cross-embodiment learning....

1 min 1 month, 2 weeks ago

LOW Academic International

Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies

arXiv:2602.18291v1 Announce Type: new Abstract: Online Multi-Agent Reinforcement Learning (MARL) is a prominent framework for efficient agent coordination. Crucially, enhancing policy expressiveness is pivotal for achieving superior performance. Diffusion-based generative models are well-positioned to meet this demand, having demonstrated remarkable...

1 min 1 month, 2 weeks ago

LOW Academic International

AI Hallucination from Students' Perspective: A Thematic Analysis

arXiv:2602.17671v1 Announce Type: cross Abstract: As students increasingly rely on large language models, hallucinations pose a growing threat to learning. To mitigate this, AI literacy must expand beyond prompt engineering to address how students should detect and respond to LLM...

1 min 1 month, 2 weeks ago

LOW Academic International

CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models

arXiv:2602.17684v1 Announce Type: cross Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) has driven recent progress in code large language models by leveraging execution-based feedback from unit tests, but its scalability is fundamentally constrained by the availability and reliability of high-quality...

1 min 1 month, 2 weeks ago
