Linus Pin-Jie Lin's (@linusdd44804) Twitter Profile
Linus Pin-Jie Lin

@linusdd44804

PhD @VT_CS, Master @LstSaar. Interested in efficient model development & modular LMs

ID: 1115673042414211072

Website: https://pjlintw.github.io/ | Joined: 09-04-2019 17:49:16

109 Tweets

70 Followers

338 Following

Prateek Yadav's (@prateeky2806) Twitter Profile Photo


Ever wondered if model merging works at scale? Maybe the benefits wear off for bigger models?

Maybe you considered using model merging for post-training of your large model but not sure if it generalizes well?

cc: Google AI Google DeepMind UNC NLP
🧵👇

Excited to announce my
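In its simplest form, model merging is element-wise parameter averaging across fine-tuned variants of the same base model. A minimal sketch with toy weight dicts (hypothetical values, not the experiments from the paper):

```python
import numpy as np

# Toy state_dicts for two fine-tuned variants of the same base model
# (hypothetical values, for illustration only).
expert_math = {"w": np.array([1.0, 3.0, 2.0])}
expert_code = {"w": np.array([2.0, 1.0, 4.0])}

# Uniform merging: element-wise average of matching parameters.
merged = {k: (expert_math[k] + expert_code[k]) / 2 for k in expert_math}
```

Real recipes (e.g., weighted averaging or TIES-style conflict resolution) refine this, but the core operation is this per-parameter combination.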
Tu Vu's (@tuvllms) Twitter Profile Photo


🚨 New paper 🚨

Excited to share my first paper w/ my PhD students!!

We find that advanced LLM capabilities conferred by instruction or alignment tuning (e.g., SFT, RLHF, DPO, GRPO) can be encoded into model diff vectors (à la task vectors) and transferred across model
Linus Pin-Jie Lin's (@linusdd44804) Twitter Profile Photo

Check out our new paper: just apply a diff vector from an instruction-tuned model to another non-instruction-tuned model. That’s it.

✅ Instruction-following transfers
✅ Multilingual improves
✅ Alignment (SFT, DPO, RLVR, GRPO) carries over

Huge thx to all my collaborators!
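The diff-vector recipe above reduces to a subtraction and an addition over matching parameters. A sketch with toy state_dicts (hypothetical values, not the paper's actual models):

```python
import numpy as np

# Toy parameter dicts standing in for model state_dicts (hypothetical values).
base_a = {"w": np.array([1.0, 2.0])}      # base model that was instruction-tuned
instruct_a = {"w": np.array([1.5, 2.5])}  # its instruction-tuned counterpart
base_b = {"w": np.array([0.5, 1.0])}      # a different, non-instruction-tuned model

# Diff vector: instruction-tuned weights minus the matching base weights.
diff = {k: instruct_a[k] - base_a[k] for k in base_a}

# Transfer: add the diff vector to the other base model's weights.
transferred = {k: base_b[k] + diff[k] for k in base_b}
```

For this to be meaningful, the two models need compatible architectures so that parameters line up key by key.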

Tsendsuren's (@tsendeemts) Twitter Profile Photo

Almost 7 years ago, Tu Vu and I wrote our first paper together, one of few. It is fantastic to see the first paper by Tu’s students this time. Congratulations and looking forward to many such great works from Tu’s group!

Linus Pin-Jie Lin's (@linusdd44804) Twitter Profile Photo

My first PhD paper is out 😆 took 7 months and lots of back-and-forth. Learned so much from Tu — sharp thinking, real feedback, and always pushing the idea further. Also, shoutout to my collaborators and the folks at Virginia Tech Computer Science!

Siva Reddy's (@sivareddyg) Twitter Profile Photo

Introducing the DeepSeek-R1 Thoughtology -- the most comprehensive study of R1 reasoning chains/thoughts ✨. Probably everything you need to know about R1 thoughts. If we missed something, please let us know.

Tu Vu's (@tuvllms) Twitter Profile Photo

📢 Research internship at Google 📢 I am looking for a PhD student researcher to work with me and my colleagues on advanced reasoning and/or RAG factuality this summer at Google, Mountain View, CA. We will focus on open-source models and benchmarks, and aim to publish our findings.

Eran Malach's (@eranmalach) Twitter Profile Photo

How does RL improve performance on math reasoning? Studying RL from pretrained models is hard, as behavior depends on choice of base model. 🚨 In our new work, we train models *from scratch* to study the effect of the data mix on the behavior of RL. arxiv.org/abs/2504.07912

Tu Vu's (@tuvllms) Twitter Profile Photo


✨ New paper ✨
🚨 Scaling test-time compute can lead to inverse or flattened scaling!!

We introduce SealQA, a new challenge benchmark w/ questions that trigger conflicting, ambiguous, or unhelpful web search results. Key takeaways:

➡️ Frontier LLMs struggle on Seal-0 (SealQA’s
Rohan Paul's (@rohanpaul_ai) Twitter Profile Photo


More thinking power at test-time doesn't fix noisy-search problems—SealQA proves it.

AI's reasoning capabilities fall flat when web search turns messy, and SealQA quantifies that.

SealQA introduces an exceptionally challenging benchmark for search-augmented language models,
Tu Vu's (@tuvllms) Twitter Profile Photo

Excited to share that our paper on model merging at scale has been accepted to Transactions on Machine Learning Research (TMLR). Huge congrats to my intern Prateek Yadav and our awesome co-authors Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Mohit Bansal, and Tsendsuren 🎉!!

Tsendsuren's (@tsendeemts) Twitter Profile Photo

This work got accepted at Transactions on Machine Learning Research (TMLR). Congratulations to Prateek Yadav and my co-authors. Also, thank you to the reviewers and editors for their time.

Thinh's (@thinhphp_vt) Twitter Profile Photo

DeepSeek achieved a strong result on Seal-0, a challenging benchmark for reasoning over conflicting search results. 🎊

Thinking Machines' (@thinkymachines) Twitter Profile Photo

LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely---more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.

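The LoRA setup being compared to full fine-tuning keeps the pretrained weight frozen and learns a low-rank additive update. A minimal forward-pass sketch (toy dimensions and random weights, not the post's actual experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2   # toy sizes; the low-rank assumption is r << d

W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
A = rng.normal(size=(r, d_in))      # trainable down-projection
B = np.zeros((d_out, r))            # trainable up-projection, zero-initialized
alpha = 16.0                        # common scaling hyperparameter

x = rng.normal(size=(d_in,))

# LoRA forward pass: frozen path plus scaled low-rank update B @ A.
y = W @ x + (alpha / r) * (B @ (A @ x))
```

Because B starts at zero, the adapter is initially a no-op and training only updates A and B, which is why the memory and storage cost is so much lower than full fine-tuning.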