Afra Amini (@afra_amini)'s Twitter Profile
Afra Amini

@afra_amini

Ph.D. student at ETH AI Center, ex-intern @GoogleDeepMind

ID: 1439874627434582017

Link: https://afraamini.github.io/ · Joined: 20-09-2021 08:50:47

40 Tweets

420 Followers

276 Following

Mike S. Schäfer (@mss7676)'s Twitter Profile Photo

How well do different #LargeLanguageModels perform in portraying #climatechange information❓

Paper w/ colleagues from Google DeepMind & ETH Zürich - accepted for ICML Conference, one of the world's leading #machinelearning conferences

Link (open access)➡️openreview.net/forum?id=ScIHQ…

Thread⬇️
Niklas Stoehr (@niklas_stoehr)'s Twitter Profile Photo

Our new mechanistic interpretability work "Activation Scaling for Steering and Interpreting Language Models" was accepted into Findings of EMNLP 2024! 🔴🔵

📄arxiv.org/pdf/2410.04962

Kevin Du, Vésteinn Snæbjarnarson, Bob West, Ryan Cotterell and Aaron Schein

thread 👇
ETH AI Center (@eth_ai_center)'s Twitter Profile Photo

The #ETHAICenter application is now open! 
Interested in doing research on interdisciplinary AI topics? 
Join our Fellowship programs. Apply by 19 November 2024: ai.ethz.ch/apply
#PhD #PhDProgram #MachineLearning #AI #BigData #DataScience #DeepLearning #PostDoc
Afra Amini (@afra_amini)'s Twitter Profile Photo

It is extremely sad to see ETH recommending the rejection of students based on four criteria, which in many cases translate into the student's country of origin!

ETH AI Center (@eth_ai_center)'s Twitter Profile Photo

🚨 Only 12 Days Left to Apply for the ETH AI Center Fellowship Programs! 🚨

Don’t miss your chance to be part of Europe’s top AI research hub!

⏰ Apply now!
ai.ethz.ch/apply
Deadline: 19 Nov, 2024
Theodora Kontogianni (@dorakontog)'s Twitter Profile Photo

🚨 Recruiting PhD students to join my team at DTU Visual Computing & Pioneer Centre for AI on 3D Vision! 🇩🇰 🚲🌊🏰

Copenhagen is an emerging hub for computer vision with a thriving community—and it’s an amazing city!

📢 Apply now: efzu.fa.em2.oraclecloud.com/hcmUI/Candidat… & please reach out with any questions!

Ziteng Sun (@sziteng)'s Twitter Profile Photo

Inference-time procedures (e.g. Best-of-N, CoT) have been instrumental to recent development of LLMs. The standard RLHF framework focuses only on improving the trained model. This creates a train/inference mismatch.

Can we align our model to better suit a given inference-time…
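For background on one such procedure: Best-of-N draws N candidates and keeps the one a reward function scores highest. A minimal sketch with toy stand-ins for the generator and the reward model (not the thread's method):

```python
import random

def best_of_n(generate, reward, n=8):
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=reward)

# Toy stand-ins: the "generator" draws random integers and the
# "reward model" prefers values close to 50.
random.seed(0)
generate = lambda: random.randint(0, 100)
reward = lambda x: -abs(x - 50)

best = best_of_n(generate, reward, n=16)
```

The train/inference mismatch the tweet describes: standard RLHF optimizes `generate` alone, even though at deployment the model is wrapped in a selector like `best_of_n`.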
Alice Bizeul (@alicebizeul)'s Twitter Profile Photo

✨New Preprint ✨ Ever thought that reconstructing masked pixels for image representation learning seems sub-optimal?

In our new preprint, we show how masking principal components—rather than raw pixel patches—improves Masked Image Modelling (MIM).

Find out more below 🧵
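The core idea can be illustrated in a few lines. This is a hypothetical numpy sketch, not the authors' implementation: treat each flattened patch as a vector, compute principal components over the patch set, and mask by zeroing a random subset of PCA coefficients instead of zeroing pixel patches.

```python
import numpy as np

def pca_mask(patches, mask_ratio=0.5, rng=None):
    """Mask principal components of flattened patches.

    patches: (n, d) array, one flattened patch per row.
    Returns the reconstruction with a random subset of
    PCA coefficients zeroed out.
    """
    rng = np.random.default_rng(rng)
    mean = patches.mean(axis=0)
    centered = patches - mean
    # Principal directions via SVD of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    coeffs = centered @ vt.T            # per-patch PCA coefficients
    k = int(mask_ratio * vt.shape[0])
    masked = rng.choice(vt.shape[0], size=k, replace=False)
    coeffs[:, masked] = 0.0             # mask components, not pixels
    return coeffs @ vt + mean           # back to pixel space

patches = np.random.default_rng(0).normal(size=(64, 16))
visible = pca_mask(patches, mask_ratio=0.5, rng=0)
```

With `mask_ratio=0.0` the round trip through PCA space is lossless, so the function reduces to the identity; increasing the ratio removes more global structure from the input.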
Aarash Feizi (@aarashfeizi)'s Twitter Profile Photo

🚨 Excited to introduce PairBench! 🚨

💡 TL;DR: VLM-judges can fail at data comparison!

✅ PairBench helps you pick the right one by testing alignment, symmetry, smoothness & controllability—ensuring reliable auto-evaluation.

📄Paper: arxiv.org/abs/2502.15210

🧵  Thread: 👇
Afra Amini (@afra_amini)'s Twitter Profile Photo

Excited to share that this paper has been accepted to ICLR 2025 🎉

We've added more experiments in the camera-ready version: arxiv.org/pdf/2407.06057

Code is available here: github.com/rycolab/vbon

Ben Lipkin (@ben_lipkin)'s Twitter Profile Photo

Many LM applications may be formulated as targeting some (Boolean) constraint. Generate a…

- Python program that passes a test suite
- PDDL plan that satisfies a goal
- CoT trajectory that yields a positive reward

The list goes on… How can we efficiently satisfy these? 🧵👇
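For context, the naive baseline for satisfying a Boolean constraint is rejection sampling: draw from the model until a verifier accepts. A toy sketch with stand-ins for the sampler and the constraint (not the thread's proposed method):

```python
import random

def rejection_sample(propose, accept, max_tries=1000):
    """Draw proposals until the Boolean constraint accepts one."""
    for _ in range(max_tries):
        candidate = propose()
        if accept(candidate):
            return candidate
    raise RuntimeError("no accepted sample within budget")

# Toy stand-ins: "generate a program" becomes "draw an int",
# and the test-suite verifier becomes a divisibility check.
random.seed(0)
sample = rejection_sample(lambda: random.randint(1, 100),
                          lambda x: x % 7 == 0)
```

The cost of this baseline grows with how rarely the constraint accepts, which is exactly why more efficient constrained-generation methods are of interest.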

Saumya Malik (@saumyamalik44)'s Twitter Profile Photo

I’m thrilled to share RewardBench 2 📊— We created a new multi-domain reward model evaluation that is substantially harder than RewardBench, we trained and released 70 reward models, and we gained insights about reward modeling benchmarks and downstream performance!

Valentina Pyatkin (@valentina__py)'s Twitter Profile Photo

💡Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR.
But the set of constraints and verifier functions is limited and most models overfit on IFEval.
We introduce IFBench to measure model generalization to unseen constraints.