Natasha Jaques (@natashajaques)'s Twitter Profile
Natasha Jaques

@natashajaques

Assistant Professor @uwcse and Senior Research Scientist at @GoogleAI. Let's get off this app: bsky.app/profile/natash…

ID: 51257255

Link: http://natashajaques.ai · Joined: 26-06-2009 22:36:02

1.1K Tweets

28.28K Followers

1.1K Following

Inductive Biases in RL (@ibrlworkshop):

Announcing our first keynote! 🎤 Natasha Jaques (natashajaques.ai), Assistant Professor at the University of Washington and Senior Research Scientist at Google DeepMind, will speak on “Social Reinforcement Learning” — exploring multi-agent and human-AI interactions.

Eric Jang (@ericjang11):

Revoking visas of Chinese students studying in critical fields like AI and Robotics is incredibly short-sighted and harmful to America’s long-term prosperity. We want the best from every country to work for team America.

hardmaru (@hardmaru):

New Paper! Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents. A longstanding goal of AI research has been the creation of AI that can learn indefinitely. One path toward that goal is an AI that improves itself by rewriting its own code, including any code…

Jeff Clune (@jeffclune):

Excited to introduce the Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents. We harness the power of open-ended algorithms to search for agentic systems that get better at coding, including improving their own code. It’s the Automated Design of Agentic Systems…
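
The tweet names the ingredients (open-ended search, self-modifying agents, a coding benchmark), so a toy sketch of that loop is easy to write; the placeholder functions below stand in for the LLM agent and the benchmark and are my assumptions, not the paper's implementation.

```python
import random

def propose_modification(agent_code: str) -> str:
    """Hypothetical stand-in: in the real system, the agent (an LLM) edits
    its own source code. Here we append a random marker to stay runnable."""
    return agent_code + f"\n# tweak {random.randint(0, 9999)}"

def evaluate(agent_code: str) -> float:
    """Hypothetical coding-benchmark score in [0, 1]."""
    return random.random()

# Open-ended archive: new variants are kept even when they don't beat the
# current best, so the search can branch instead of pure hill-climbing.
archive = [{"code": "# seed agent", "score": 0.0}]
for step in range(100):
    parent = random.choice(archive)
    child_code = propose_modification(parent["code"])
    archive.append({"code": child_code, "score": evaluate(child_code)})

best = max(a["score"] for a in archive)
print(f"archive size: {len(archive)}, best score: {best:.3f}")
```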

Natasha Jaques (@natashajaques):

Currently, reinforcement learning from human feedback (RLHF) is the predominant method for ensuring LLMs are safe and aligned. And yet it provides no guarantees that they won’t say something harmful, copyrighted, or inappropriate. In our latest paper, we use online adversarial…
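
The tweet is cut off before the method, but generic online adversarial training for LLM safety usually looks like the loop below; every name here is a placeholder of mine, not the paper's actual approach.

```python
import random

PROMPTS = ["prompt A", "prompt B", "prompt C"]   # hypothetical attack pool

def red_team_prompt() -> str:
    """Hypothetical attacker; in practice an LLM trained with RL to find
    prompts that elicit unsafe completions."""
    return random.choice(PROMPTS)

def target_respond(prompt: str) -> str:
    """Hypothetical target LLM being aligned."""
    return f"response to {prompt}"

def safety_score(response: str) -> float:
    """Hypothetical safety classifier: 1.0 = safe, 0.0 = harmful."""
    return random.random()

for step in range(1_000):
    prompt = red_team_prompt()
    score = safety_score(target_respond(prompt))
    attacker_reward = 1.0 - score   # attacker is paid for breaking the model
    defender_reward = score         # defender is paid for staying safe
    # ...in a real system, both models are updated online here, e.g. with a
    # policy-gradient step on their respective rewards...
```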

Animesh Garg (@animesh_garg):

Natasha Jaques I am surprised this needed to be shown empirically! I always felt it was obvious. Perhaps ideas such as error propagation & accumulation and covariate shift are a way of thinking for folks in sequential decision making, but not for supervised learning 🤷

Abhishek Gupta (@abhishekunique7):

Learned visuomotor policies are notoriously fragile: they break with changes in conditions like lighting, clutter, or object variations, among other things. In Yunchu's latest work, we asked whether we could get these policies to be robust and generalizable with a clever…

Andrej Karpathy (@karpathy):

Sully Media will trend to drugs: highly addictive, brain-rotting. It's early enough that it's not yet obvious to most, but late enough that it's already real.

Natasha Jaques (@natashajaques):

How can you train an adversarial cooperator? It would be great to use adversarial training to get robust human-AI cooperation, but if you directly train the cooperation partner with an adversarial objective, it will just sabotage the task. Our latest work, GOAT, uses a…
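
The tweet cuts off before the mechanism, but the setup it describes suggests constraining the adversary so it can only choose among competent cooperators. One common way to do that is to search adversarially over the latent space of a generative model of partners; here is a toy sketch under that assumption, with all names hypothetical.

```python
import random

def partner_from_latent(z: float) -> dict:
    """Hypothetical pre-trained generator mapping a latent code to a
    *competent* cooperative partner policy: every z yields a partner that
    still plays the task, so the adversary cannot simply sabotage it."""
    return {"latent": z}

def team_return(partner: dict) -> float:
    """Hypothetical rollout of the ego agent teamed with this partner."""
    return random.random()

# Adversarial step: search the latent space for the partner that makes the
# team do worst, then train the ego agent against that partner.
candidates = [random.uniform(-1.0, 1.0) for _ in range(32)]
worst_z = min(candidates, key=lambda z: team_return(partner_from_latent(z)))
print(f"next training partner drawn from latent z = {worst_z:+.3f}")
```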

Natasha Jaques (@natashajaques):

In our latest paper, we discovered a surprising result: training LLMs with self-play reinforcement learning on zero-sum games (like poker) significantly improves performance on math and reasoning benchmarks, zero-shot. Whaaat? How does this work? We analyze the results and find…
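
Self-play on a zero-sum game is easy to make concrete. Below is a toy version using matching pennies and a one-parameter policy per player in place of poker and an LLM; the paper's actual training recipe is not reproduced here.

```python
import random

# P(play heads) for each of the two self-play copies.
probs = [0.5, 0.5]
lr = 0.01

def play(p: float) -> int:
    return 0 if random.random() < p else 1    # 0 = heads, 1 = tails

for step in range(10_000):
    a1, a2 = play(probs[0]), play(probs[1])
    r1 = 1.0 if a1 == a2 else -1.0            # zero-sum: r2 = -r1
    # Crude REINFORCE-style sign update for each copy; naive self-play on
    # matching pennies tends to cycle around the Nash mix of 0.50.
    probs[0] = min(0.99, max(0.01, probs[0] + lr * (1 if a1 == 0 else -1) * r1))
    probs[1] = min(0.99, max(0.01, probs[1] + lr * (1 if a2 == 0 else -1) * -r1))

print(f"final mixes: {probs[0]:.2f}, {probs[1]:.2f} (Nash is 0.50 each)")
```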

Natasha Jaques (@natashajaques):

Excited to release our latest paper on a new multi-turn RL objective for training LLMs to learn how to learn to adapt to the user. By optimizing for intrinsic curiosity, the LLM learns how to ask a series of questions over the course of the conversation to improve the accuracy of
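
"Optimizing for intrinsic curiosity" can be made concrete by rewarding each question for how much it reduces the model's uncertainty about the user. Here is a toy information-gain version; the setup and names are my illustration, not the paper's objective.

```python
import math
import random

secret = random.randrange(8)          # hidden user preference (1 of 8)
candidates = list(range(8))           # agent's current belief set

def entropy_bits(n: int) -> float:
    """Entropy of a uniform belief over n candidates."""
    return math.log2(n) if n > 0 else 0.0

total_reward = 0.0
while len(candidates) > 1:
    # "Ask a question": split the belief set in half; the user's truthful
    # answer tells the agent which half the preference is in.
    mid = len(candidates) // 2
    before = entropy_bits(len(candidates))
    half = candidates[:mid]
    candidates = half if secret in half else candidates[mid:]
    total_reward += before - entropy_bits(len(candidates))  # bits gained

print(f"identified preference {candidates[0]} (truth: {secret}); "
      f"curiosity reward = {total_reward:.1f} bits")
```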