2025

An Alternative to LLM-based Reward Models