Predict, Don't React: Value-Based Safety Forecasting for LLM Streaming
arXiv:2604.03962v1 Announce Type: new Abstract: In many practical LLM deployments, a single guardrail is used for both prompt and response moderation. Prompt moderation operates on …
Pride Kavumba, Koki Wataoka, Huy H. Nguyen, Jiaxuan Li, Masaya Ohagi
39 views