Academic

Academic · 1 min

Knowledge Distillation for Large Language Models

arXiv:2603.13765v1 · Announce Type: new · Abstract: We propose a resource-efficient framework for compressing large language models through knowledge distillation, combined with guided chain-of-thought reinforcement learning. Using …

Alejandro Paredes La Torre, Barbara Flores, Diego Rodriguez
12 views
Academic · 1 min

LiveWeb-IE: A Benchmark For Online Web Information Extraction

arXiv:2603.13773v1 · Announce Type: new · Abstract: Web information extraction (WIE) is the task of automatically extracting data from web pages, offering high utility for various applications. …

Seungbin Yang, Jihwan Kim, Jaemin Choi, Dongjin Kim, Soyoung Yang, ChaeHun Park, Jaegul Choo
13 views
Academic · 1 min

Learning When to Trust in Contextual Bandits

arXiv:2603.13356v1 · Announce Type: new · Abstract: Standard approaches to Robust Reinforcement Learning assume that feedback sources are either globally trustworthy or globally adversarial. In this paper, …

Majid Ghasemi, Mark Crowley
4 views