Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile
Vaibhav Adlakha

@vaibhav_adlakha

PhD candidate @MILAMontreal and @mcgillu | RA @iitdelhi | Maths & CS undergrad from @IITGuwahati | Interested in #NLProc

ID: 3536420677

Joined: 12-09-2015 09:20:05

268 Tweets

889 Followers

1.1K Following

Alexandra Chronopoulou (@alexandraxron)'s Twitter Profile Photo

We are organizing Repl4NLP 2025 along with Freda Shi, Giorgos Vernikos, Vaibhav Adlakha, Xiang Lorraine Li, and Bodhisattwa Majumder. The workshop will be co-located with NAACL 2025 in Albuquerque, New Mexico, and we plan to have a great panel of speakers. Consider submitting your coolest work!

Xing Han Lu (@xhluca)'s Twitter Profile Photo

Glad to see BM25S (bm25s.github.io) has been downloaded 1M times on PyPI 🎉

Numbers aside, it makes me happy to hear the positive experience from friends working on retrieval. It's good to know that people near me are enjoying it!

Discussion: github.com/xhluca/bm25s/d…
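As a rough illustration of what a BM25 retriever like BM25S computes under the hood, here is a from-scratch sketch of the scoring function (Lucene-style IDF variant); this is illustrative only, not BM25S's actual API, tokenization, or defaults:

    import math
    from collections import Counter

    def bm25_scores(query, docs, k1=1.5, b=0.75):
        """Score whitespace-tokenized docs against a query with BM25."""
        tokenized = [doc.lower().split() for doc in docs]
        N = len(tokenized)
        avgdl = sum(len(d) for d in tokenized) / N
        # Document frequency of each query term.
        df = {t: sum(1 for d in tokenized if t in d) for t in set(query.lower().split())}
        scores = []
        for d in tokenized:
            tf = Counter(d)
            score = 0.0
            for t, n_t in df.items():
                idf = math.log((N - n_t + 0.5) / (n_t + 0.5) + 1)  # always non-negative
                denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
                score += idf * tf[t] * (k1 + 1) / denom
            scores.append(score)
        return scores

    docs = ["a cat is a feline", "a dog is a canine", "purple is a color"]
    print(bm25_scores("does the cat purr", docs))  # highest score for the cat doc
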
Sai Rajeswar (@rajeswarsai)'s Twitter Profile Photo

We're happy to report that our paper "BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks" has been accepted at ICLR and is now available at 🧐 arxiv.org/abs/2412.04626. Congratulations to my dream team 📷👍 ServiceNow Research and Mila - Institut québécois d'IA! #ICLR2025

Ahmed Masry (@ahmed_masry97)'s Twitter Profile Photo

Happy to announce AlignVLM📏: a novel approach to bridging vision and language latent spaces for multimodal understanding in VLMs! 🌍📄🖼️

🔗 Read the paper: arxiv.org/abs/2502.01341
🧵👇 Thread
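From the abstract, the core idea is keeping projected vision features inside the LLM's text embedding space. Below is a minimal PyTorch sketch of one such connector, mapping each vision feature to a convex combination of the LLM's token embeddings; the class name and shapes are illustrative assumptions, and the paper's actual ALIGN module and training recipe may differ:

    import torch
    import torch.nn as nn

    class ConvexEmbeddingConnector(nn.Module):
        """Map vision features onto convex combinations of LLM token embeddings,
        so projected features always land inside the language embedding space."""
        def __init__(self, vision_dim, llm_embed):
            super().__init__()
            self.proj = nn.Linear(vision_dim, llm_embed.shape[0])  # feature -> vocab logits
            self.register_buffer("llm_embed", llm_embed)           # (vocab, llm_dim), frozen here

        def forward(self, vision_feats):
            weights = self.proj(vision_feats).softmax(dim=-1)  # convex weights over the vocab
            return weights @ self.llm_embed                    # (..., llm_dim)

    # Toy shapes: 2 images x 16 patches of 512-d features; 1000-token, 768-d embedding table.
    connector = ConvexEmbeddingConnector(512, torch.randn(1000, 768))
    print(connector(torch.randn(2, 16, 512)).shape)  # torch.Size([2, 16, 768])
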
Mushtaq Bilal, PhD (@mushtaqbilalphd)'s Twitter Profile Photo

Meta illegally downloaded 80+ terabytes of books from LibGen, Anna's Archive, and Z-Library to train their AI models.

In 2010, Aaron Swartz downloaded only 70 GB of articles from JSTOR (0.0875% of Meta's haul). He faced a $1 million fine and 35 years in jail. He took his own life in 2013.
Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile Photo

Check out the new MMTEB benchmark 🙌 if you are looking for an extensive, reproducible, and open-source evaluation of text embedders!
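For context, MMTEB is served through the mteb package, so an evaluation run takes only a few lines. A minimal sketch, assuming the package's documented quickstart (exact names may drift between versions):

    import mteb
    from sentence_transformers import SentenceTransformer

    # Any model exposing encode() works; MiniLM is just a small, fast example.
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    # Pick one small task here; MMTEB itself spans hundreds of tasks and many languages.
    tasks = mteb.get_tasks(tasks=["Banking77Classification"])
    results = mteb.MTEB(tasks=tasks).run(model, output_folder="results")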

Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile Photo

LLM agents can be used with harmful and malicious intent. 🤬 Check out SafeArena for a comprehensive evaluation of LLM agents! 🛠️

Jacob Springer (@jacspringer)'s Twitter Profile Photo

Training with more data = better LLMs, right? 🚨

False! Scaling language models by adding more pre-training data can decrease your performance after post-training!

Introducing "catastrophic overtraining." 🥁🧵+arXiv 👇

1/9
Vaibhav Adlakha (@vaibhav_adlakha)'s Twitter Profile Photo

Check out our comprehensive study and analysis of DeepSeek’s 🐳 reasoning chains! This opens a new dimension for analysing the inner workings of LLMs. Incredible effort by our research group!

Amirhossein Kazemnejad (@a_kazemnejad)'s Twitter Profile Photo

Introducing nanoAhaMoment: a Karpathy-style, single-file RL-for-LLMs library (<700 lines)

- super hackable
- no TRL / Verl, no abstraction💆‍♂️
- Single GPU, full param tuning, 3B LLM
- Efficient (R1-zero countdown < 10h)

Comes with a from-scratch, fully spelled-out YT video [1/n]
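For a sense of the core update such single-file trainers implement, here is a GRPO-flavored REINFORCE step in PyTorch; the function name and shapes are illustrative assumptions, not the repo's actual code:

    import torch

    def grpo_style_loss(logprobs, mask, rewards):
        """REINFORCE with a group-mean baseline over G sampled completions of one prompt.

        logprobs: (G, T) log-probs of the sampled completion tokens
        mask:     (G, T) 1.0 for real tokens, 0.0 for padding
        rewards:  (G,)   scalar reward per completion (e.g. 1.0 if the countdown answer verifies)
        """
        # Advantage: how much better each completion did than the group average.
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-6)
        # Push up log-probs of above-average completions, down the rest.
        per_token = -(adv[:, None] * logprobs) * mask
        return per_token.sum() / mask.sum()

    # Toy example: 4 completions, 8 tokens each, only the first answered correctly.
    logprobs = torch.randn(4, 8, requires_grad=True)
    loss = grpo_style_loss(logprobs, torch.ones(4, 8), torch.tensor([1.0, 0.0, 0.0, 0.0]))
    loss.backward()  # in a real trainer the gradient flows into the policy
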
Xing Han Lu (@xhluca)'s Twitter Profile Photo

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories  

We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can assess web agent trajectories.

We find that rule-based evals underreport success rates, and
🇺🇦 Dzmitry Bahdanau (@dbahdanau)'s Twitter Profile Photo

I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference!

Code: github.com/ServiceNow/Pip…
Blog: huggingface.co/blog/ServiceNo…
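The in-flight update pattern itself is easy to sketch: generation workers keep decoding and simply pick up newer weights between chunks instead of idling until a batch boundary. A toy asyncio sketch (illustrative only, not PipelineRL's actual code):

    import asyncio
    import itertools

    class WeightStore:
        """Trainer publishes new weights; generators pick them up between chunks."""
        def __init__(self):
            self.version, self.weights = 0, {}

        def publish(self, weights):
            self.version += 1
            self.weights = weights

    async def trainer(store):
        for step in itertools.count(1):
            await asyncio.sleep(0.3)  # stand-in for an optimizer step
            store.publish({"step": step})

    async def generator(store, gen_id):
        seen = 0
        for _ in range(10):
            await asyncio.sleep(0.1)  # stand-in for decoding a chunk of tokens
            if store.version > seen:  # in-flight update: swap weights mid-sequence
                seen = store.version
                print(f"gen{gen_id}: picked up weights v{seen} without restarting")

    async def main():
        store = WeightStore()
        trainer_task = asyncio.create_task(trainer(store))
        await asyncio.gather(*(generator(store, i) for i in range(2)))
        trainer_task.cancel()

    asyncio.run(main())
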
Ziling Cheng (@ziling_cheng)'s Twitter Profile Photo

Do LLMs hallucinate randomly? Not quite. Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode — revealing how LLMs generalize using abstract classes + context cues, albeit unreliably.

📎 Paper: arxiv.org/abs/2505.22630

1/n
Benno Krojer (@benno_krojer)'s Twitter Profile Photo

Excited to share the results of my internship research with AI at Meta, as part of a larger world modeling release!

What subtle shortcuts are VideoLLMs taking on spatio-temporal questions?

And how can we instead curate shortcut-robust examples at a large scale?

Details 👇🔬
Xing Han Lu (@xhluca)'s Twitter Profile Photo

"Build the web for agents, not agents for the web" This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).

"Build the web for agents, not agents for the web"

This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).
Verna Dankers (@vernadankers)'s Twitter Profile Photo

I miss Edinburgh and its wonderful people already!! Thanks to Tal Linzen and Edoardo Ponti for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join Siva Reddy's wonderful lab at Mila - Institut québécois d'IA 🤩

cohere (@cohere)'s Twitter Profile Photo

Cohere is excited to announce our new office in Montreal, QC! We look forward to contributing to the local AI landscape, collaborating with new and existing partners in the city, and growing our Montreal-based team. cohere.com/blog/montreal-…