Tu Trinh
@thetututrain
Aka Alina Trinh. ML research engineer @scale_AI | EECS MS @UCBerkeley @CHAI_Berkeley @berkeley_ai
ID: 1746648602183905280
14-01-2024 21:41:24
3 Tweets
37 Followers
124 Following
How can a robot self-assess when it has received enough demonstrations to perform a task correctly? Excited to present our work at #HRI2024 during Tuesday's session on Learning! Paper arxiv.org/abs/2211.15542 w/ Haoyu Chen and Daniel Brown
Our paper "From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers" was accepted into NeurIPS! We show that SAEs _are_ indeed useful for safety applications! SAEs can reliably detect, and meaningfully suppress, hallucinations.