Lujain Ibrahim لجين إبراهيم (@lujainmibrahim)'s Twitter Profile
Lujain Ibrahim لجين إبراهيم

@lujainmibrahim

Working on AI evaluations & societal impact / PhD candidate @oiioxford / formerly intern @googledeepmind @govai_ @schwarzmanorg @nyuniversity

ID: 1155880668435496960

Website: http://lujainibrahim.com
Joined: 29-07-2019 16:40:01

2.2K Tweets

1.1K Followers

858 Following

Markus Anderljung (@manderljung)

What the public thinks about AI really matters. Dr Noemi Dreksler and colleagues recently put together the most comprehensive review of the literature out there.

Cas (Stephen Casper) (@stephenlcasper)

🚨New paper led by Ariba Khan

Lots of prior research has assumed that LLMs have stable preferences, align with coherent principles, or can be steered to represent specific worldviews. No ❌, no ❌, and definitely no ❌. We need to be careful not to anthropomorphize LLMs too much.

Taylor Sorensen (@ma_tay_)

🤔🤖Most AI systems assume there’s just one right answer—but many tasks have reasonable disagreement. How can we better model human variation? 🌍✨

We propose modeling at the individual-level using open-ended, textual value profiles! 🗣️📝

arxiv.org/abs/2503.15484
(1/?)

Iason Gabriel (@iasongabriel)

🚨Excited to share my new paper with Geoff Keeling, ‘A matter of principle? AI alignment as the fair treatment of claims’ published in Philosophical Studies today! 🙌

Technical AI Governance @ ICML 2025 (@taig_icml)

📣 We’re thrilled to announce the first workshop on Technical AI Governance (TAIG) at #ICML2025 this July in Vancouver! Join us (& this stellar list of speakers) in bringing together technical & policy experts to shape the future of AI governance!

Jacy Reese Anthis (@jacyanthis)

Should we use LLMs 🤖 to simulate human research subjects 🧑? In our new preprint, we argue sims can augment human studies to scale up social science as AI technology accelerates. We identify five tractable challenges and argue this is a promising and underused research method 🧵

Ben Bucknall (@ben_s_bucknall)

📢 Over the moon that Open Problems in Technical AI Governance has now been published at TMLR! See the updated version here: shorturl.at/joQJS

Hugo Larochelle (@hugo_larochelle)

Consciousness is a fascinating topic. But personally, I'd rather resources be directed towards preventing (human) harms coming from people mistakenly believing an AI system is conscious.

Centre for the Governance of AI (GovAI) (@govai_)

Apply for GovAI’s DC Fellowship! Fellows will join GovAI in Washington, DC for 3 months to conduct paid research on a topic of their choice, with mentorship from leading experts in the field of AI policy. Application Deadline: May 25, 2025 at 23:59 ET. governance.ai/post/dc-fellow…

Myra Cheng (@chengmyra1)

Dear ChatGPT, Am I the Asshole?
While Reddit users might say yes, your favorite LLM probably won’t.
We present Social Sycophancy: a new way to understand and measure sycophancy as how LLMs overly preserve users' self-image.

Charvi Rastogi (@charvvvv_)

Join MLCommons for a social at ACM FAccT 2025, where we're tackling the critical need for a unified and collective approach to AI safety. AI safety research is siloed, hindering the development of safe and robust AI systems that work for everyone.

Alan Chan (@_achan96_)

New blog post!

AI agents are becoming increasingly capable, but will need new protocols and systems in order to work effectively and safely.

Who should build such protocols and systems?

Jamie Bernardi (@the_jbernardi)

Important work. Non-Claude models seem to refuse reasoning about alignment faking, and have less intrinsic tendency for goal-guarding. Observing this diff is a step towards better aligning AI.

I'm in awe that 2025 is seeing alignment become an increasingly empirical discipline!

Arvind Narayanan (@random_walker)

I wish data centers would offer tours to the public and schools could take field trips to them. They are the defining pieces of infrastructure of our generation, but unlike railroads, the grid, or anything else, we never get to see them and experience their scale.

Emerging Technology Observatory (@emergingtechobs)

🤩🤩🤩Saad Siddiqui and Lujain Ibrahim لجين إبراهيم adapted AGORA's taxonomy to compare US and Chinese documents on AI risk: "...despite strategic competition, there exist concrete opportunities for bilateral U.S.-China cooperation in the development of responsible AI." 🔗🧵

Neil Rathi (@neil_rathi)

new paper 🌟

interpretation of uncertainty expressions like "i think" differs cross-linguistically. we show that (1) llms are sensitive to these differences but (2) humans overrely on their outputs across languages