Rishub Jain (@shubadubadub) Twitter Tweets • TwiCopy

Rishub Jain

@shubadubadub

+ Follow

Research Engineer at @GoogleDeepMind, currently working on Safe+Ethical AI

ID: 370444519

linkhttp://rishubjain.github.io calendar_today09-09-2011 01:27:44

74 Tweet

235 Followers

421 Following

Gate.io

@gate_io

5 hours ago

🔥The 9th Round of Easy Loan, Earn $40 Reward is in progress❗️ ⏰ Promotion Period: January 15th - Feburary 15th, 2025 👉 Register now and check more details at gate.io/campaigns/358

thumb_up_off_alt34

chat_bubble_outline39

repeat6

shareShare

New Google DeepMind safety paper! LLM agents are coming – how do we stop them finding complex plans to hack the reward? Our method, MONA, prevents many such hacks, *even if* humans are unable to detect them! Inspired by myopic optimization but better performance – details in🧵

thumb_up_off_alt576

chat_bubble_outline16

repeat97

shareShare

David Lindner

@davlindner

6 months ago

Want to join one of the best AI safety teams in the world? We're hiring Google DeepMind! We have open positions for research engineers and research scientists in the AGI Safety & Alignment and Gemini Safety teams. Locations: London, Zurich, New York, Mountain View and SF

thumb_up_off_alt317

chat_bubble_outline6

repeat19

shareShare

Rohin Shah

@rohinmshah

6 months ago

We're hiring! Join an elite team that sets an AGI safety approach for all of Google -- both through development and implementation of the Frontier Safety Framework (FSF), and through research that enables a future stronger FSF.

thumb_up_off_alt297

chat_bubble_outline11

repeat37

shareShare

Arthur Conmy

@arthurconmy

5 months ago

We are hiring Applied Interpretability researchers on the GDM Mech Interp Team!🧵 If interpretability is ever going to be useful, we need it to be applied at the frontier. Come work with Neel Nanda, the Google DeepMind AGI Safety team, and me: apply by 28th February as a

thumb_up_off_alt283

chat_bubble_outline2

repeat35

shareShare

Google DeepMind

@googledeepmind

4 months ago

AGI could revolutionize many fields - from healthcare to education - but it's crucial that it’s developed responsibly. Today, we’re sharing how we’re thinking about safety and security on the path to AGI. → goo.gle/3R08XcD

thumb_up_off_alt1,1K

chat_bubble_outline64

repeat185

shareShare

Rohin Shah

@rohinmshah

4 months ago

Just released GDM’s 100+ page approach to AGI safety & security! (Don’t worry, there’s a 10 page summary.) AGI will be transformative. It enables massive benefits, but could also pose risks. Responsible development means proactively preparing for severe harms before they arise.

thumb_up_off_alt362

chat_bubble_outline13

repeat68

shareShare

Sophia

@sopharicks

4 months ago

Thanks to Sophie Bridgers and Rishub Jain for sharing with the BuzzRobot community the Google DeepMind framework on how AI and humans can complement each other and create synergy! Watch the lecture on our YouTube channel: youtu.be/IeXaiCvPM_E

thumb_up_off_alt6

chat_bubble_outline0

repeat3

shareShare

Andreas Terzis

@aterzis

4 months ago

1/3 🚨 AGI agents are venturing into untrusted territories, but current LLMs face vulnerabilities like prompt injections. How do we ensure their safety? 🤔

thumb_up_off_alt57

chat_bubble_outline2

repeat13

shareShare

Saffron Huang

@saffronhuang

3 months ago

I have a new piece out in Noema Magazine today with sam manning on how and why we should ensure broad ownership in AI. UBI is not the answer to the threat of automation. We need capital-based approaches (human, productive, financial capital) to mitigate economic/political power

thumb_up_off_alt131

chat_bubble_outline8

repeat25

shareShare

Xiangyu Qi

@xiangyuqi_pton

3 months ago

Thrilled to know that our paper, `Safety Alignment Should be Made More Than Just a Few Tokens Deep`, received the ICLR 2025 Outstanding Paper Award. We sincerely thank the ICLR committee for awarding one of this year's Outstanding Paper Awards to AI Safety / Adversarial ML.

thumb_up_off_alt347

chat_bubble_outline20

repeat30

shareShare