Brian Lester (@blester125)'s Twitter Profile
Brian Lester

@blester125

Senior Research Engineer at Google DeepMind working on parameter-efficient adaptation and few-shot generalization, mostly within NLP. Views are my own. he/him

ID: 1584222612

Link: https://blester125.com · Joined: 10-07-2013 23:16:31

94 Tweets

452 Followers

244 Following

Brian Lester (@blester125)'s Twitter Profile Photo

In addition to the impressive performance gains, I'm incredibly excited about how this work opens new exploration of targeted transfer learning via prompt similarity. I can't wait to see what gets built on this!

Brian Lester (@blester125)'s Twitter Profile Photo

The blog post for my EMNLP 2021 paper on Prompt Tuning is out! Writing for a blog is pretty different from writing for a conference, so if anything was confusing in the paper, maybe this will help it click (or you could have just asked me lol)

Tu Vu (@tuvllms)'s Twitter Profile Photo

Happy to share our soft prompt transfer (SPoT) paper made it to #ACL2022 🎉. On the SuperGLUE leaderboard, SPoT is the first parameter-efficient approach that is competitive with methods that tune billions of parameters. w/ Brian Lester, Noah Constant, @aboSamoor, Daniel Cer

Daniel Cer (@daniel_m_cer)'s Twitter Profile Photo

We are presenting SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer at ACL 2022 today during the 2pm in-person ML for NLP poster session and tomorrow at the 7:30am virtual poster session (virtual session w/@tuvuumass). #acl2022 #NLProc #ACLinDublin #acl2022nlp

Brian Lester (@blester125)'s Twitter Profile Photo

Am I missing something wrt the name "gradient checkpointing"? Clearing cached activations and recomputing them in the backward pass seems like the opposite of checkpointing. The name makes it sound like we are storing the activations on disk. docs.aws.amazon.com/sagemaker/late…
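The recompute-in-backward idea the tweet describes can be sketched in plain Python. This is a toy with hand-written scalar layers and derivatives (all names here are illustrative, not PyTorch's or SageMaker's API): the standard backward caches every activation, while the "checkpointed" version stores only segment-boundary activations and recomputes the rest during the backward pass, trading compute for memory.

```python
# Toy gradient checkpointing: scalar "layers" with known derivatives.
layers = [lambda x: 2 * x, lambda x: x + 3, lambda x: x * x]
derivs = [lambda x: 2.0, lambda x: 1.0, lambda x: 2 * x]

def forward_full(x):
    """Standard forward: cache every intermediate activation."""
    acts = [x]
    for f in layers:
        acts.append(f(acts[-1]))
    return acts

def backward_full(acts):
    """Chain rule over the cached activations."""
    g = 1.0
    for f_prime, a in zip(reversed(derivs), reversed(acts[:-1])):
        g *= f_prime(a)
    return g

def forward_checkpointed(x, segment=2):
    """Cache only activations at segment boundaries ("checkpoints")."""
    ckpts, a = [(0, x)], x
    for i, f in enumerate(layers):
        a = f(a)
        if (i + 1) % segment == 0 and i + 1 < len(layers):
            ckpts.append((i + 1, a))
    return ckpts, a

def backward_checkpointed(ckpts, segment=2):
    """Recompute each segment's activations from its checkpoint,
    then consume them for the chain rule."""
    g = 1.0
    for start, a in reversed(ckpts):
        end = min(start + segment, len(layers))
        acts = [a]
        for f in layers[start:end]:
            acts.append(f(acts[-1]))
        for i in reversed(range(start, end)):
            g *= derivs[i](acts[i - start])
    return g
```

For input 1.0 the chain is 1 → 2 → 5 → 25, and both backward passes recover the same gradient (20.0), but the checkpointed forward holds fewer activations in memory. Nothing ever touches disk, which is the tweet's point about the name.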

Tu Vu (@tuvllms)'s Twitter Profile Photo

While parameter-efficient tuning methods were originally proposed to reduce computation & storage costs, it turns out they can help overcome catastrophic forgetting and thus improve performance on zero-shot cross-lingual generation. Check out our work Google AI EMNLP 2022 👇 1/10

Brian Lester (@blester125)'s Twitter Profile Photo

.Motive, I saw PyTorch in the licenses for Dead Space. Are you using it as a GPU-accelerated linear algebra library, or are there actually neural nets running during the game? #deadspace #deadspaceremake

Brian Lester (@blester125)'s Twitter Profile Photo

We just pushed a new update adding support for the (very impressive) safetensors library from our friends at Hugging Face! Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).
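For readers unfamiliar with what safetensors files look like, the on-disk layout is simple: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/byte offsets, then the raw tensor bytes. Below is a pure-Python sketch of that layout for illustration only; the function names are made up here, and in practice you would use the official `safetensors` library from Hugging Face rather than hand-rolling the format.

```python
import json
import struct

def save_tensors(path, tensors):
    """tensors: dict of name -> (dtype_str, shape, raw little-endian bytes)."""
    header, blobs, offset = {}, [], 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        blobs.append(data)
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)                          # JSON header
        for blob in blobs:                             # raw tensor data
            f.write(blob)

def load_tensors(path):
    """Inverse of save_tensors: parse the header, slice out each tensor."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        body = f.read()
    return {name: (meta["dtype"], meta["shape"],
                   body[meta["data_offsets"][0]:meta["data_offsets"][1]])
            for name, meta in header.items()}
```

Because the header is plain JSON up front, a reader can inspect tensor names and shapes without loading (or even trusting) the tensor data itself, which is a big part of the format's appeal over pickle-based checkpoints.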

Brian Lester (@blester125)'s Twitter Profile Photo

Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks. Check out arxiv.org/abs/2404.03626 and help Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant, and me make Kevin's dream a reality.