
Hadi Khalaf
@hskhalaf
PhD student @ Harvard SEAS, thinking about alignment, information theory, and the like
ID: 1888695141298257921
09-02-2025 21:03:47
12 Tweets
12 Followers
23 Following









It is critical for scientific integrity that we trust our measure of progress. lmarena.ai has become the go-to evaluation for AI progress. Our release today demonstrates the difficulty in maintaining fair evaluations on lmarena.ai, despite best intentions.


