@StephenLCasper
See Sections 5.1.2 and 5.1.3. In a pure denoising regime, you would see student-ground-truth agreement > student-supervisor agreement, and student-supervisor agreement scaling positively with student compute. So we're at least partially in a debiasing regime. We probably should've discussed this!
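For concreteness, here is a rough sketch of how one might check those two signatures from prediction lists. The function names, and the choice to read "positive scaling" as strictly increasing agreement across students ordered by compute, are my own assumptions for illustration and not the paper's evaluation code.

```python
from typing import Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of examples on which two label sequences match."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def student_beats_supervisor_agreement(
    student: Sequence[int],
    supervisor: Sequence[int],
    ground_truth: Sequence[int],
) -> bool:
    """First signature: student-ground-truth agreement exceeds
    student-supervisor agreement (hypothetical helper)."""
    return agreement(student, ground_truth) > agreement(student, supervisor)


def scales_positively(agreements_by_compute: Sequence[float]) -> bool:
    """Second signature, under the assumption that 'positive scaling' means
    student-supervisor agreement strictly increases as student compute grows.
    Input must already be ordered from smallest to largest student."""
    pairs = zip(agreements_by_compute, agreements_by_compute[1:])
    return all(later > earlier for earlier, later in pairs)
```

If either check fails on real runs, that would be consistent with being at least partly outside the pure denoising picture.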