Hadi Khalaf (@hskhalaf) 's Twitter Profile
Hadi Khalaf

@hskhalaf

PhD student @ Harvard SEAS, thinking about alignment, information theory, and the like

ID: 1888695141298257921

09-02-2025 21:03:47

12 Tweets

12 Followers

23 Following

Hadi Khalaf (@hskhalaf) 's Twitter Profile Photo

I used to see llama as a base model in most experiments, now qwen has taken over. Diversity in base models in experiments is much much more valuable than any hyperparam tuning or extra runs!

Hadi Khalaf (@hskhalaf) 's Twitter Profile Photo

On my reading list this week: "the first theoretical result on how to identify the ideal depth for safety alignment... indicating that broader ensembles can compensate for shallower alignments"!!!! arxiv.org/abs/2502.00669

Ai2 (@allen_ai) 's Twitter Profile Photo

Ever wonder how LLM developers choose their pretraining data? It’s not guesswork— all AI labs create small-scale models as experiments, but the models and their data are rarely shared. DataDecide opens up the process: 1,050 models, 30k checkpoints, 25 datasets & 10 benchmarks 🧵

Sara Hooker (@sarahookr) 's Twitter Profile Photo

It is critical for scientific integrity that we trust our measure of progress. The lmarena.ai has become the go-to evaluation for AI progress. Our release today demonstrates the difficulty in maintaining fair evaluations on lmarena.ai, despite best intentions.

Hadi Khalaf (@hskhalaf) 's Twitter Profile Photo

@ whoever is on the google ai studio team: please fix the chat history never being saved! i cannot access most of my gemini conversations... and this has been an issue since january 🫤

Hadi Khalaf (@hskhalaf) 's Twitter Profile Photo

I judge llms by how bayesian they are
#1 gemini 2.5 pro (channeling bayes himself)
#2 gpt 5
#3 o3
#4 gpt 4o
Please stop the bayesian propaganda