WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization

Under review at ICML 2026

Weakly-Supervised Group-Relative Policy Optimization (WS-GRPO) applies GRPO to variational inference in AI-agentic benchmarks and LLM-based topic modeling evaluation frameworks.