Why LLM-as-judge fails for code evaluation. Here's what works.

navigara.medium.com

2 points by alienll 7 hours ago