No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models
arXiv:2603.03203v1. Abstract: CDD, or Contamination Detection via output Distribution, identifies data contamination by measuring the peakedness of a model's sampled outputs. We …
Omer Sela
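The abstract excerpt does not say how peakedness is computed, so the following is only a rough sketch of one plausible reading: it assumes peakedness is the fraction of sampled outputs that lie within a small edit distance of a reference output (for example, the model's greedy decode). The function names, the whitespace tokenization, and the `alpha` threshold are all hypothetical choices made for illustration and are not taken from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (tokens or characters)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def peakedness(reference, samples, alpha=0.05):
    """Fraction of samples within alpha * len(reference) token edits.

    Assumption: `reference` stands in for a greedy decode and `samples`
    for temperature-sampled outputs of the same prompt. Tokens are
    whitespace-separated words here, purely for simplicity.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ref_tokens = reference.split()
    limit = alpha * len(ref_tokens)
    close = sum(
        1 for s in samples
        if edit_distance(ref_tokens, s.split()) <= limit
    )
    return close / len(samples)


if __name__ == "__main__":
    greedy = "the answer is forty two"
    sampled = [
        "the answer is forty two",
        "the answer is forty two",
        "it might be something else entirely",
    ]
    # A score near 1.0 means the samples collapse onto the reference.
    print(peakedness(greedy, sampled))
```

Under this reading, a score close to 1 means the sampled outputs are concentrated around a single string, which is the kind of signal a peakedness-based detector would flag; where to set the decision threshold is left open here.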