Jaehoon Lee (@hoonkp)'s Twitter Profile
Jaehoon Lee

@hoonkp

Researcher in machine learning with background in physics; Member of Technical Staff @AnthropicAI; Prev. Research scientist @GoogleDeepMind/@GoogleBrain.

ID: 90276706

Link: http://jaehlee.github.io | Joined: 15-11-2009 23:47:33

242 Tweets

1.1K Followers

662 Following

Ethan Dyer (@ethansdyer)'s Twitter Profile Photo

1/ Super excited to introduce #Minerva 🦉(goo.gle/3yGpTN7). Minerva was trained on math and science found on the web and can solve many multi-step quantitative reasoning problems.

Lilian Weng (@lilianweng)'s Twitter Profile Photo

🧮 I finally spent some time learning what exactly the Neural Tangent Kernel (NTK) is and went through some of the mathematical proofs. Hopefully after reading this, you will not feel that all the math behind NTK is scary, but rather quite intuitive. lilianweng.github.io/posts/2022-09-…
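
As a quick reference for readers who skip the link, the central object can be stated in one line (a standard textbook definition, not quoted from the post):

```latex
% Empirical Neural Tangent Kernel of a network f(x; theta):
% the Gram matrix of parameter gradients between two inputs.
\Theta(x, x') \;=\; \nabla_\theta f(x;\theta)^\top \, \nabla_\theta f(x';\theta)
```

In the infinite-width limit this kernel remains constant during gradient-descent training, which is why a wide trained network behaves like kernel regression with \Theta.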

Sam Altman (@sama)'s Twitter Profile Photo

the deadline for applying to the OpenAI residency is tomorrow. if you are an engineer or researcher from any field who wants to start working on AI, please consider applying. many of our best people have come from this program! boards.greenhouse.io/openai/jobs/46…

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

Very interesting paper by James Sully, Dan Roberts and Alex Maloney investigating the theoretical origin of neural scaling laws! Happy to read the 97-page paper and learn about new tools in RMT and insights into how the statistics of natural datasets translate into power-law scaling.

James Harrison (@jmes_harrison)'s Twitter Profile Photo

Tired of tuning your neural network optimizer? Wish there was an optimizer that just worked? We’re excited to release VeLO 🚲, the first hyperparameter-free learned optimizer that outperforms hand-designed optimizers on real-world problems: velo-code.github.io 🧵

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

Today at 11am CT, Hall J #806 we are presenting our paper on infinite-width neural network kernels! We have methods to compute NTK/NNGP for an extended set of activations + sketched embeddings for efficient approximation (100x) of compute-intensive conv kernels! See you there!
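
For context on what "computing NTK/NNGP" looks like in practice, here is a minimal sketch using the open-source neural-tangents library (the tweet does not name a library, and this generic fully-connected example does not use the paper's sketching tricks):

```python
# Minimal sketch: closed-form NNGP and NTK kernels for an infinite-width MLP,
# using the neural-tangents library (assumed here; not named in the tweet).
import jax.numpy as jnp
from neural_tangents import stax

# Infinite-width two-hidden-layer ReLU network.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x1 = jnp.ones((4, 8))  # 4 inputs with 8 features each
x2 = jnp.ones((6, 8))  # 6 inputs with 8 features each

# Exact infinite-width kernel matrices between the two batches.
kernels = kernel_fn(x1, x2, ('nngp', 'ntk'))
print(kernels.nngp.shape, kernels.ntk.shape)  # (4, 6) (4, 6)
```

The paper's contribution sits on top of this workflow: kernels for an extended set of activation functions, plus sketched embeddings that approximate the compute-intensive convolutional kernels roughly 100x more efficiently.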

Zi Wang, Ph.D. (@ziwphd)'s Twitter Profile Photo

Jasper talking about the ongoing journey towards BIG Gaussian processes! A team effort with Jaehoon Lee, Ben Adlam, Shreyas Padhy and Zachary Nado. Join us at the NeurIPS GP workshop neurips.cc/virtual/2022/w…

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

This is an amazing opportunity to work on impactful problems in Large Language Models with cool people! Highly recommended!

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

Analyzing training instabilities in Transformers is made more accessible by awesome work from Mitchell Wortsman during his internship at Google DeepMind! We encourage you to think more about the fundamental causes and effects of training instabilities as models scale up!

Noah Constant (@noahconst)'s Twitter Profile Photo

Ever wonder why we don’t train LLMs over highly compressed text? Turns out it’s hard to make it work. Check out our paper for some progress that we’re hoping others can build on. arxiv.org/abs/2404.03626 With Brian Lester, Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein

Brian Lester (@blester125)'s Twitter Profile Photo

Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks. Check out arxiv.org/abs/2404.03626 and help Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant and me make Kevin’s dream a reality.

Peter J. Liu (@peterjliu)'s Twitter Profile Photo

We recently open-sourced a relatively minimal implementation example of Transformer language model training in JAX, called NanoDO. If you stick to vanilla JAX components, the code is relatively straightforward to read -- the model file is <150 lines. We found it useful as a
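
As an illustration of the "vanilla JAX components" style (a hypothetical sketch, not the NanoDO code itself), a single causal self-attention layer fits comfortably in plain jax.numpy:

```python
# Hypothetical sketch of causal self-attention in plain JAX; illustrative only,
# not taken from NanoDO. Assumes a single head with d_head == d_model.
import jax
import jax.numpy as jnp

def causal_self_attention(params, x):
    """x: [seq_len, d_model] -> [seq_len, d_model]."""
    seq_len, _ = x.shape
    q, k, v = x @ params['wq'], x @ params['wk'], x @ params['wv']
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))  # causal mask
    scores = jnp.where(mask, scores, -jnp.inf)
    weights = jax.nn.softmax(scores, axis=-1)
    return (weights @ v) @ params['wo']

d_model, seq_len = 16, 8
keys = jax.random.split(jax.random.PRNGKey(0), 4)
params = {name: 0.02 * jax.random.normal(k, (d_model, d_model))
          for name, k in zip(['wq', 'wk', 'wv', 'wo'], keys)}
print(causal_self_attention(params, jnp.ones((seq_len, d_model))).shape)  # (8, 16)
```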

Peter J. Liu (@peterjliu)'s Twitter Profile Photo

It was a pleasure working on Gemma 2. The team is relatively small but very capable. Glad to see it get released. On the origin of techniques: 'like Grok', 'like Mistral', etc. is a weird way to describe them as they all originated at Google Brain/DeepMind and the way they ended

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

Tour de force led by Katie Everett investigating the interplay between neural network parameterization and optimizers; the thread/paper includes a lot of gems (theory insights, extensive empirics, and cool new tricks)!

Behnam Neyshabur (@bneyshabur)'s Twitter Profile Photo

Ethan Dyer and I have started a new team at Anthropic — and we’re hiring! Our team is organized around the north star goal of building an AI scientist: a system capable of solving the long-term reasoning challenges and core capabilities needed to push the scientific

Jaehoon Lee (@hoonkp)'s Twitter Profile Photo

Claude 4 models are here 🎉 From research to engineering, safety to product - this launch showcases what's possible when the entire Anthropic team comes together. Honored to be part of this journey! Claude has been transforming my daily workflow, hope it does the same for you!