Elan Rosenfeld (@elanrosenfeld) 's Twitter Profile
Elan Rosenfeld

@elanrosenfeld

Researcher @GoogleAI, PhD @CarnegieMellon

ID: 1045447599090794497

Link: http://cs.cmu.edu/~elan
Joined: 27-09-2018 22:58:25

332 Tweets

1.1K Followers

193 Following

Elan Rosenfeld (@elanrosenfeld) 's Twitter Profile Photo

A sequence of videos of Will Smith eating spaghetti, overlaid with the shutterstock logo. In some clips he uses a fork and in others his hands overflow with spaghetti as he shovels it into his mouth. In each clip he is wearing a different outfit. One clip has two Will Smiths.

Francesco Orabona (@bremen79) 's Twitter Profile Photo

Have you ever wondered why the KL divergence is in all the PAC-Bayes bounds? Are we sure it is the optimal choice? We now know: for sure KL is *not* the optimal one! New work with Ilja Kuzborskij, Kwang-Sung (Kwang) Jun, yulian wu, and Kyoungseok Jang x.com/StatMLPapers/s…

Vaishnavh Nagarajan (@_vaishnavh) 's Twitter Profile Photo

🗣️ “Next-token predictors can’t plan!” ⚔️ ​​“False! Every distribution is expressible as product of next-token probabilities!” 🗣️ In work w/ Gregor Bachmann , we carefully flesh out this emerging, fragmented debate & articulate a key new failure. 🔴 arxiv.org/abs/2403.06963

Samuel Sokota (@ssokota) 's Twitter Profile Photo

SOTA AI for games like poker & Hanabi relies on search methods that don’t scale to games w/ large amounts of hidden information. In our ICLR paper, we introduce simple search methods that scale to large games & get SOTA for Hanabi w/ 100x less compute. 1/N arxiv.org/abs/2304.13138

ML@CMU (@mlcmublog) 's Twitter Profile Photo

Imagine you're a data scientist who solves several related linear regression problems from the same application domain. Can you learn how to best use a combination of L1 and L2 regularization penalties? We show that you can! How much data is needed? blog.ml.cmu.edu/2024/04/12/how…

Elan Rosenfeld (@elanrosenfeld) 's Twitter Profile Photo

I can't speak for other schools, but I can tell you this is definitely not the case at CMU. This seems like a surefire way to lose your school's status as a top program.

Elan Rosenfeld (@elanrosenfeld) 's Twitter Profile Photo

Almost forgot the obligatory self-promotion... I'll be presenting this work this afternoon at ICLR, poster #148. Stop by to gain a new understanding of NN optimization!

Chandler Squires (@chandlersquires) 's Twitter Profile Photo

If you are submitting to NeurIPS: reward yourself with a talk after! If you are not: no excuses not to attend this talk! Either way, join us this week at CARE to hear Sorawit (James) Saengkyongam speak about representation learning for extrapolation. portal.valencelabs.com/events/post/id…

Zico Kolter (@zicokolter) 's Twitter Profile Photo

I'm thrilled to share that I will become the next Director of the Machine Learning Department at Carnegie Mellon. MLD is a true gem, a department dedicated entirely to ML. Faculty and past directors have been personal role models in my career. cs.cmu.edu/news/2024/kolt…

Aaron Roth (@aaroth) 's Twitter Profile Photo

Congrats to the best paper award winners at COLT 2024! learningtheory.org/colt2024/award… First up, The Price of Adaptivity in Stochastic Convex Optimization by Yair Carmon and Oliver Hinder:

Clément Canonne (on Blue🦋Sky) (@ccanonne_) 's Twitter Profile Photo

"So, unimaginative theoreticians of the world, unite and pursue problems that have been studied only once." #rememberingLuca lucatrevisan.wordpress.com/2006/11/07/on-…

Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

Say hello to Gemini 1.5 Flash-8B ⚡️, now available for production usage with:
- 50% lower price (vs 1.5 Flash)
- 2x higher rate limits (vs 1.5 Flash)
- lower latency on small prompts (vs 1.5 Flash)

developers.googleblog.com/en/gemini-15-f…

QC (@qiaochuyuan) 's Twitter Profile Photo

tried getting claude to write funny tweets starting from 9 examples. it generated some okay stuff but nothing that actually made me laugh. then i tried asking it to generate tweets written from its perspective rather than a human's and actually laughed. hmm

Logan Kilpatrick (@officiallogank) 's Twitter Profile Photo

It’s still an early version, but check out how the model handles a challenging puzzle involving both visual and textual clues: (2/3)

Elan Rosenfeld (@elanrosenfeld) 's Twitter Profile Photo

I learned a lot about LLM training dynamics during this project, led by Sara Kangaslahti. Surprisingly, we can find meaningful + interpretable breakthroughs in model capabilities which are non-obvious from just the aggregated loss. Check out the thread by Naomi Saphra for details!

Demis Hassabis (@demishassabis) 's Twitter Profile Photo

Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to Thang Luong and the team! deepmind.google/discover/blog/…