Lujain Ibrahim لجين إبراهيم (@lujainmibrahim)'s Twitter Profile
Lujain Ibrahim لجين إبراهيم

@lujainmibrahim

Working on AI evaluations & societal impact / PhD candidate @oiioxford / formerly intern @googledeepmind @govai_ @schwarzmanorg @nyuniversity

ID: 1155880668435496960

Website: http://lujainibrahim.com
Joined: 29-07-2019 16:40:01

2.2K Tweets

1.1K Followers

858 Following

Markus Anderljung (@manderljung)

What the public thinks about AI really matters. Dr Noemi Dreksler and colleagues recently put together the most comprehensive review of the literature out there.

Cas (Stephen Casper) (@stephenlcasper)

🚨New paper led by Ariba Khan

Lots of prior research has assumed that LLMs have stable preferences, align with coherent principles, or can be steered to represent specific worldviews. No ❌, no ❌, and definitely no ❌. We need to be careful not to anthropomorphize LLMs too much.

Taylor Sorensen (@ma_tay_)

🤔🤖Most AI systems assume there’s just one right answer—but many tasks have reasonable disagreement. How can we better model human variation? 🌍✨

We propose modeling at the individual-level using open-ended, textual value profiles! 🗣️📝

arxiv.org/abs/2503.15484
(1/?)

Iason Gabriel (@iasongabriel)

🚨Excited to share my new paper with Geoff Keeling, ‘A matter of principle? AI alignment as the fair treatment of claims’ published in Philosophical Studies today! 🙌

Technical AI Governance @ ICML 2025 (@taig_icml)

📣 We’re thrilled to announce the first workshop on Technical AI Governance (TAIG) at #ICML2025 this July in Vancouver! Join us (& this stellar list of speakers) in bringing together technical & policy experts to shape the future of AI governance!

Jacy Reese Anthis (@jacyanthis)

Should we use LLMs 🤖 to simulate human research subjects 🧑? In our new preprint, we argue sims can augment human studies to scale up social science as AI technology accelerates. We identify five tractable challenges and argue this is a promising and underused research method 🧵

Ben Bucknall (@ben_s_bucknall)

📢 Over the moon that Open Problems in Technical AI Governance has now been published at TMLR! See the updated version here: shorturl.at/joQJS

Hugo Larochelle (@hugo_larochelle)

Consciousness is a fascinating topic. But personally, I'd rather resources be directed towards preventing (human) harms coming from people mistakenly believing an AI system is conscious.

Centre for the Governance of AI (GovAI) (@govai_)

Apply for GovAI’s DC Fellowship! Fellows will join GovAI in Washington, DC for 3 months to conduct paid research on a topic of their choice, with mentorship from leading experts in the field of AI policy. Application Deadline: May 25, 2025 at 23:59 ET. governance.ai/post/dc-fellow…

Myra Cheng (@chengmyra1)

Dear ChatGPT, Am I the Asshole?
While Reddit users might say yes, your favorite LLM probably won’t.
We present Social Sycophancy: a new way to understand and measure sycophancy as how LLMs overly preserve users' self-image.

Charvi Rastogi (@charvvvv_)

Join MLCommons for a social at ACM FAccT 2025, where we're tackling the critical need for a unified and collective approach to AI safety. AI safety research is siloed, hindering the development of safe and robust AI systems that work for everyone.

Alan Chan (@_achan96_)

New blog post!

AI agents are becoming increasingly capable, but will need new protocols and systems in order to work effectively and safely.

Who should build such protocols and systems?

Jamie Bernardi (@the_jbernardi)

Important work. Non-Claude models seem to refuse reasoning about alignment faking, and have less intrinsic tendency for goal-guarding. Observing this diff is a step towards better aligning AI.

I'm in awe that 2025 is seeing alignment become an increasingly empirical discipline!

Arvind Narayanan (@random_walker)

I wish data centers would offer tours to the public and schools could take field trips to them. They are the defining pieces of infrastructure of our generation, but unlike railroads, the grid, or anything else, we never get to see them and experience their scale.

Emerging Technology Observatory (@emergingtechobs)

🤩🤩🤩Saad Siddiqui and Lujain Ibrahim لجين إبراهيم adapted AGORA's taxonomy to compare US and Chinese documents on AI risk: "...despite strategic competition, there exist concrete opportunities for bilateral U.S.-China cooperation in the development of responsible AI." 🔗🧵

Neil Rathi (@neil_rathi)

new paper 🌟

interpretation of uncertainty expressions like "i think" differs cross-linguistically. we show that (1) llms are sensitive to these differences but (2) humans overrely on their outputs across languages