Lilian Weng (@lilianweng)'s Twitter Profile
Lilian Weng

@lilianweng

Co-founder of Thinking Machines Lab @thinkymachines; Ex-VP, AI Safety & robotics, applied research @OpenAI; Author of Lil'Log

ID: 96999384

Link: https://lilianweng.github.io
Joined: 15-12-2009 15:17:40

187 Tweets

140K Followers

158 Following

Lilian Weng (@lilianweng)

Rule-based rewards (RBRs) use model to provide RL signals based on a set of safety rubrics, making it easier to adapt to changing safety policies wo/ heavy dependency on human data. It also enables us to look at safety and capability in a more unified lens as a more capable

Lilian Weng (@lilianweng)

Iterative deployment for maximizing AI safety learning needs to be built on top of rigorous science and process. We are learning and improving through each launch.

Lilian Weng (@lilianweng)

📢 We are hiring Research Scientists and Engineers for safety research at OpenAI, ranging from safe model behavior training, adversarial robustness, AI in healthcare, frontier risk evaluation and more. Please fill in this form if you are interested: jobs.ashbyhq.com/openai/form/oa…

Lilian Weng (@lilianweng)

After working at OpenAI for almost 7 years, I decide to leave. I learned so much and now I'm ready for a reset and something new. Here is the note I just shared with the team. 🩵

Lilian Weng (@lilianweng)

🦃 At the end of Thanksgiving holidays, I finally finished the piece on reward hacking. Not an easy one to write, phew. Reward hacking occurs when an RL agent exploits flaws in the reward function or env to maximize rewards without learning the intended behavior. This is imo a

Mira Murati (@miramurati)

I started Thinking Machines Lab alongside a remarkable team of scientists, engineers, and builders. We're building three things: - Helping people adapt AI systems to work for their specific needs - Developing strong foundations to build more capable AI systems - Fostering a

Lilian Weng (@lilianweng)

When a new dataset comes out, I get excited and check it out and then only realize that this is another meta-mixed dataset combining a collection of other existing datasets. My brain immediately acts like "oh fork ... contamination!" No meta-meta-mixed dataset plzzzz :lolsob:

Lilian Weng (@lilianweng)

Probably the first product Thinky will build is a full panel of dials that researchers can use to physically adjust all the hparams during training. We gonna do hardware one day and it is the time 😂