@lmsysorg
I was playing around with evolving benchmark questions using @WizardLM_AI's Evol-Instruct method a while back and got some interesting results, especially with Breadth evolution. Less cheating than rephrasing, but not fair either - how far can we push this line? 😏