Brian Lester (@blester125)'s Twitter Profile
Brian Lester

@blester125

Senior Research Engineer at Google DeepMind working on parameter-efficient adaptation and few-shot generalization, mostly within NLP. Views are my own. he/him

ID: 1584222612

Link: https://blester125.com · Joined: 10-07-2013 23:16:31

94 Tweets

452 Followers

244 Following

Brian Lester (@blester125)'s Twitter Profile Photo

In addition to the impressive performance gains, I'm incredibly excited about how this work opens new exploration of targeted transfer learning via prompt similarity. I can't wait to see what gets built on this!

Brian Lester (@blester125)'s Twitter Profile Photo

The blog post for my EMNLP 2021 paper on Prompt Tuning is out! Writing for a blog is pretty different from writing for a conference, so if anything was confusing in the paper, maybe this will help it click (or you could have just asked me lol)

Tu Vu (@tuvllms)'s Twitter Profile Photo

Happy to share our soft prompt transfer (SPoT) paper made it to #ACL2022 🎉. On the SuperGLUE leaderboard, SPoT is the first parameter-efficient approach that is competitive with methods that tune billions of parameters. w/ Brian Lester, Noah Constant, @aboSamoor, Daniel Cer

Daniel Cer (@daniel_m_cer)'s Twitter Profile Photo

We are presenting SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer at ACL 2022 today during the 2pm in-person ML for NLP poster session and tomorrow at the 7:30am virtual poster session (virtual session w/@tuvuumass). #acl2022 #NLProc #ACLinDublin #acl2022nlp

Brian Lester (@blester125)'s Twitter Profile Photo

Am I missing something wrt the name "gradient checkpointing"? Clearing cached activations and recomputing them in the backward pass seems like the opposite of checkpointing. The name makes it sound like we are storing the activations on disk. docs.aws.amazon.com/sagemaker/late…
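The recompute-in-backward idea the tweet describes can be sketched in plain Python. This is a toy with hand-written scalar layers and derivatives (all names here are illustrative, not PyTorch's or SageMaker's API): the standard backward caches every activation, while the "checkpointed" version stores only segment-boundary activations and recomputes the rest during the backward pass, trading compute for memory.

```python
# Toy gradient checkpointing: scalar "layers" with known derivatives.
layers = [lambda x: 2 * x, lambda x: x + 3, lambda x: x * x]
derivs = [lambda x: 2.0, lambda x: 1.0, lambda x: 2 * x]

def forward_full(x):
    """Standard forward: cache every intermediate activation."""
    acts = [x]
    for f in layers:
        acts.append(f(acts[-1]))
    return acts

def backward_full(acts):
    """Chain rule over the cached activations."""
    g = 1.0
    for f_prime, a in zip(reversed(derivs), reversed(acts[:-1])):
        g *= f_prime(a)
    return g

def forward_checkpointed(x, segment=2):
    """Cache only activations at segment boundaries ("checkpoints")."""
    ckpts, a = [(0, x)], x
    for i, f in enumerate(layers):
        a = f(a)
        if (i + 1) % segment == 0 and i + 1 < len(layers):
            ckpts.append((i + 1, a))
    return ckpts, a

def backward_checkpointed(ckpts, segment=2):
    """Recompute each segment's activations from its checkpoint,
    then consume them for the chain rule."""
    g = 1.0
    for start, a in reversed(ckpts):
        end = min(start + segment, len(layers))
        acts = [a]
        for f in layers[start:end]:
            acts.append(f(acts[-1]))
        for i in reversed(range(start, end)):
            g *= derivs[i](acts[i - start])
    return g
```

For input 1.0 the chain is 1 → 2 → 5 → 25, and both backward passes recover the same gradient (20.0), but the checkpointed forward holds fewer activations in memory. Nothing ever touches disk, which is the tweet's point about the name.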

Tu Vu (@tuvllms)'s Twitter Profile Photo

While parameter-efficient tuning methods were originally proposed to reduce computation & storage costs, it turns out they can help overcome catastrophic forgetting and thus improve performance on zero-shot cross-lingual generation. Check out our work Google AI EMNLP 2022 👇 1/10

Brian Lester (@blester125)'s Twitter Profile Photo

.Motive, I saw PyTorch in the licenses for Dead Space. Are you using it as a GPU-accelerated linear algebra library, or are there actually neural nets running during the game? #deadspace #deadspaceremake

Brian Lester (@blester125)'s Twitter Profile Photo

We just pushed a new update adding support for the (very impressive) safetensors library from our friends at Hugging Face! Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).
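For readers unfamiliar with what safetensors files look like, the on-disk layout is simple: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype/shape/byte offsets, then the raw tensor bytes. Below is a pure-Python sketch of that layout for illustration only; the function names are made up here, and in practice you would use the official `safetensors` library from Hugging Face rather than hand-rolling the format.

```python
import json
import struct

def save_tensors(path, tensors):
    """tensors: dict of name -> (dtype_str, shape, raw little-endian bytes)."""
    header, blobs, offset = {}, [], 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        blobs.append(data)
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)                          # JSON header
        for blob in blobs:                             # raw tensor data
            f.write(blob)

def load_tensors(path):
    """Inverse of save_tensors: parse the header, slice out each tensor."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        body = f.read()
    return {name: (meta["dtype"], meta["shape"],
                   body[meta["data_offsets"][0]:meta["data_offsets"][1]])
            for name, meta in header.items()}
```

Because the header is plain JSON up front, a reader can inspect tensor names and shapes without loading (or even trusting) the tensor data itself, which is a big part of the format's appeal over pickle-based checkpoints.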

Brian Lester (@blester125)'s Twitter Profile Photo

Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks. Check out arxiv.org/abs/2404.03626 and help Jaehoon Lee, Alex Alemi, Jeffrey Pennington, Adam Roberts, Jascha Sohl-Dickstein, Noah Constant, and me make Kevin's dream a reality.