@KaranJanthe
@ExaAILabs
@aiDotEngineer
You get a lot of your recall back by repeatedly oversampling and then reranking with less compressed embeddings. For example, if a user asks for 10 results, you'd first get 10,000 results using 256-bit binary embeddings, then get the top 100 out of those 10,000 using 1024-bit binary embeddings, and finally get
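
A rough sketch of the first two stages described above, in plain Python. Everything named here is hypothetical: `binarize`, `hamming`, and `cascade_search` are illustrative helpers, and the sketch assumes each binary code is just the sign bits of the first N dimensions of a float embedding (Matryoshka-style truncation), which the post does not state. The post is cut off before the last stage, so the sketch stops at the second shortlist. The demo numbers are scaled down (2,000 docs, keep 200, then 20) rather than the 10,000 and 100 in the post.

```python
import random


def binarize(vec, bits):
    """Pack the signs of the first `bits` dimensions into one int (1 bit each).

    Assumption: shorter codes are prefixes of longer ones. The post only
    says "256 binary" and "1024 binary", not how they are produced.
    """
    code = 0
    for x in vec[:bits]:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


def cascade_search(query, corpus, stages):
    """Oversample with a coarse code, then rerank the survivors with finer ones.

    `stages` is a list of (bits, keep) pairs ordered from most to least
    compressed. Each stage reranks only the candidates kept by the previous one.
    """
    candidates = list(range(len(corpus)))
    for bits, keep in stages:
        q = binarize(query, bits)
        candidates.sort(key=lambda i: hamming(q, binarize(corpus[i], bits)))
        candidates = candidates[:keep]
    return candidates


# Toy demo with random vectors standing in for real embeddings.
random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(1024)] for _ in range(2000)]
query = list(corpus[42])  # an exact duplicate of document 42

# Stage 1: 256-bit codes over everything, keep 200.
# Stage 2: 1024-bit codes over those 200, keep 20.
shortlist = cascade_search(query, corpus, [(256, 200), (1024, 20)])
print(shortlist[:5])
```

In a real system the binary codes would be precomputed and indexed instead of rebuilt per query; the point here is only the shape of the cascade, where each less compressed stage sees a much smaller candidate set than the one before it.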