OpenAI: Investigating the consequences of accidentally grading CoT during RL

alignment.openai.com

2 points by pretext 11 hours ago