Yash Savani (@yashsavani_)'s Twitter Profile
Yash Savani

@yashsavani_

PhD student @CSDatCMU with Zico Kolter | prev research scientist @abacusai, ml eng @primer_ai | prev prev CS+Stats @Stanford @StanfordAILab

ID: 162099298

Website: https://www.yashsavani.com | Joined: 02-07-2010 18:08:25

45 Tweets

262 Followers

679 Following

Dmytro Mishkin 🇺🇦 (@ducha_aiki)'s Twitter Profile Photo

Deep Equilibrium Optical Flow Estimation Shaojie Bai, Zhengyang Geng, Yash Savani, Zico Kolter tl;dr: DEQs ("infinite depth aka single layer", arxiv.org/abs/1909.01377) look like a natural fit for optical flow estimation. arxiv.org/abs/2204.08442 github.com/locuslab/deq-f…

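The "infinite depth aka single layer" view replaces a stack of identical layers with a single fixed-point solve. A minimal PyTorch sketch of that idea (not the DEQ-Flow code; the cell `f`, the naive solver, and the tolerances are illustrative assumptions):

```python
import torch
import torch.nn as nn

class DEQLayer(nn.Module):
    """Toy deep equilibrium layer: solve z* = f(z*, x) instead of
    stacking L copies of f (the "infinite depth, single layer" view)."""

    def __init__(self, dim, max_iter=50, tol=1e-4):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.max_iter = max_iter
        self.tol = tol

    def forward(self, x):
        z = torch.zeros_like(x)
        # Plain fixed-point iteration; real DEQs use Anderson/Broyden solvers.
        for _ in range(self.max_iter):
            z_next = self.f(torch.cat([z, x], dim=-1))
            if (z_next - z).norm() < self.tol:
                return z_next
            z = z_next
        return z

x = torch.randn(8, 32)
layer = DEQLayer(32)
print(layer(x).shape)  # torch.Size([8, 32])
```

Real DEQs backprop through the fixed point via the implicit function theorem, so memory stays constant in "depth"; in this sketch gradients simply flow through the unrolled loop.
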
Zhengyang Geng (@zhengyanggeng)'s Twitter Profile Photo

Happy to share our latest DEQ work with Shaojie Bai, Yash Savani, and Zico Kolter! DEQ flow now sets SOTA zero-shot generalization performance on KITTI-15, with over 20% error reduction and super strong efficiency! Paper and Code available at paperswithcode.com/paper/deep-equ…. #CVPR2022

Zico Kolter (@zicokolter)'s Twitter Profile Photo

Excited about this work with Asher Trockman and Yash Savani (and others) on antidistillation sampling. It uses a nifty trick to efficiently generate samples that make student models _worse_ when you train on them. I spoke about it at Simons this past week. Links below.

Jeremy Cohen (@deepcohen)'s Twitter Profile Photo

I’ll be at ICLR next week presenting this paper co-written with Alex Damian. Would love to meet and chat about optimization in deep learning! My DMs are open - please reach out via DM or email. openreview.net/forum?id=sIE2r…

YixuanEvenXu (@yixuanevenxu)'s Twitter Profile Photo

✨ Did you know that NOT using all generated rollouts in GRPO can boost your reasoning LLM? Meet PODS! We down-sample rollouts and train on just a fraction, delivering notable gains over vanilla GRPO. (1/7)

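A rough sketch of the down-sampling step described above, assuming the selection rule keeps the highest- and lowest-reward rollouts (a max-variance heuristic); the function names and random rewards are illustrative, not the PODS code:

```python
import numpy as np

def downsample_rollouts(rewards, m):
    """Pick m of the n generated rollouts before the GRPO update.
    Sketch of max-variance selection: keep the m//2 lowest- and
    highest-reward rollouts, preserving the contrast that the
    group-normalized advantages rely on."""
    order = np.argsort(rewards)
    return np.concatenate([order[: m // 2], order[-(m - m // 2):]])

def grpo_advantages(rewards):
    """Group-relative advantages: z-score rewards within the group."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

rewards = np.random.rand(16)          # stand-in for per-rollout rewards
idx = downsample_rollouts(rewards, m=4)
adv = grpo_advantages(rewards[idx])   # advantages for the kept subset only
print(idx, adv)
```

Keeping the extremes preserves the reward spread inside the group, which is one way training on just a fraction of the rollouts can still beat vanilla GRPO.
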
Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

*Antidistillation Sampling* by Yash Savani, Asher Trockman, Zico Kolter, et al. They modify the logits of a model with a penalty term that poisons potential distillation attempts (by estimating the downstream distillation loss). arxiv.org/abs/2504.13146

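A minimal sketch of the sampling rule as described: perturb the teacher's next-token logits with a penalty before sampling, so the emitted traces poison downstream distillation. The penalty here is a random placeholder; in the paper it comes from an efficient estimate of the downstream distillation loss:

```python
import torch

def antidistillation_sample(teacher_logits, penalty, lam=1.0):
    """Sample from penalized logits. penalty[v] is assumed to estimate
    how much emitting token v would *help* a student distilling from
    these traces; lam trades teacher quality against poisoning strength."""
    adjusted = teacher_logits - lam * penalty
    probs = torch.softmax(adjusted, dim=-1)
    return torch.multinomial(probs, num_samples=1)

vocab = 50_000
teacher_logits = torch.randn(vocab)
penalty = torch.randn(vocab)  # placeholder for the distillability estimate
token = antidistillation_sample(teacher_logits, penalty, lam=0.5)
print(token.item())
```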