venktesh (On faculty job market) (@vvenki22) 's Twitter Profile
venktesh (On faculty job market)

@vvenki22

Postdoc researcher @tudelft wis, former PhD candidate @iiitdelhi. PM Fellow 2021 (for doctoral research). Scholar: tinyurl.com/mrk5p5j3

ID: 838435305564827649

Link: https://venkteshv.github.io · Joined: 05-03-2017 17:05:29

5.5K Tweets

695 Followers

4.4K Following

James Zou (@james_y_zou) 's Twitter Profile Photo

Introducing Fractional Reasoning: a mechanistic method to quantitatively control how much thinking an LLM performs.

tldr: we identify latent reasoning knobs in transformer embeddings ➡️ a better inference-compute approach that mitigates under/over-thinking. arxiv.org/pdf/2506.15882
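One way to picture a "latent reasoning knob" is hidden-state steering: shift activations along a learned direction, with a scalar controlling the strength. This is a minimal sketch under that assumption; the direction, scaling scheme, and names are mine, not the paper's actual method:

```python
import numpy as np

def apply_reasoning_knob(hidden, direction, alpha):
    """Shift hidden states along a unit 'reasoning' direction, scaled by alpha.

    hidden: (tokens, dim) activations; direction: (dim,) hypothetical knob vector.
    alpha quantitatively controls how strongly the model is pushed to think.
    """
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))          # 4 tokens, hidden dim 8
knob = rng.normal(size=8)
steered = apply_reasoning_knob(h, knob, alpha=0.5)
print(steered.shape)                 # each token moved by exactly alpha
```

The appeal of a scalar alpha is that it makes "amount of thinking" a continuous, tunable quantity at inference time rather than a binary prompt choice.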
Peng Qi (@qi2peng2) 's Twitter Profile Photo

Seven years ago, I co-led a paper called 𝗛𝗼𝘁𝗽𝗼𝘁𝗤𝗔 that has motivated and facilitated many #AI #Agents research works since. Today, I'm asking that you stop using HotpotQA blindly for agents research in 2025 and beyond. In my new blog post, I revisit the brief history of

Alexandre Défossez (@honualx) 's Twitter Profile Photo

We just released the TTS model that powers Unmute 🗣️ It offers low latency, high fidelity, and the fewest pronunciation errors compared to a wide range of commercial and open-source models 🎯. Preprint is coming soon 📑. See the project page below to test and learn more 👇

Sebastian Raschka (@rasbt) 's Twitter Profile Photo

If you're getting into LLMs, PyTorch is essential. And a lot of folks asked for beginner-friendly material, so I put this together: PyTorch in One Hour: From Tensors to Multi-GPU Training (sebastianraschka.com/teaching/pytor…) 📖 ~1h to read through 💡 Maybe the perfect weekend project!? I’ve

Tianyu Zheng (@zhengtianyu4) 's Twitter Profile Photo

🚀 Thrilled to announce our new work: FR3E (First Return, Entropy-Eliciting Explore)!

LLM reasoning with Reinforcement Learning often struggles with unstable and inefficient exploration. We propose FR3E, a structured framework to make it more robust & efficient.
Yucen Lily Li (@yucenlily) 's Twitter Profile Photo

In our new ICML paper, we show that popular families of OOD detection procedures, such as feature and logit based methods, are fundamentally misspecified, answering a different question than “is this point from a different distribution?” arxiv.org/abs/2507.01831 [1/7]

Dmitry Krotov (@dimakrotov) 's Twitter Profile Photo

Lagrangians are often used in physics for deriving the energy of mechanical systems. But are they useful for neural networks and AI?

It turns out they are extremely helpful for working with energy-based models and energy-based Associative Memories. You need to specify a
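A concrete instance of the Lagrangian-to-energy recipe used in energy-based associative memories (a generic sketch, not necessarily the thread's exact formulation): the activation function is the gradient of a Lagrangian, and the energy comes from its Legendre transform.

```python
import numpy as np

def lagrangian(x):
    """Log-sum-exp Lagrangian; its gradient is the softmax activation."""
    m = x.max()
    return m + np.log(np.sum(np.exp(x - m)))

def activation(x):
    """g = dL/dx: the gradient of log-sum-exp is exactly softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([1.0, 2.0, 0.5])
g = activation(x)
energy = x @ g - lagrangian(x)   # Legendre transform: E = x·g − L(x)
print(g.sum())                   # softmax activations sum to 1
```

Choosing a different Lagrangian (e.g. a sum of element-wise convex functions) yields a different activation and a different energy, which is what makes the construction so flexible.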
Kiran Purohit (on job market) (@kiranpurohit08) 's Twitter Profile Photo

If you are attending ICML Conference, catch our poster tomorrow (16th July) at the Poster session (4:30 p.m. PDT — 7 p.m. PDT)!

🎥 Video: recorder-v3.slideslive.com/#/share?share=…
📑 Paper: openreview.net/pdf?id=cuqvlLB…
💻 Code: github.com/kiranpurohit/C…

venktesh Sourangshu Bhattacharya Avishek Anand
Richard Suwandi @ICLR2025 (@richardcsuwandi) 's Twitter Profile Photo

BatchNorm wins the Test-of-Time Award at #ICML2025! 🎉

BatchNorm revolutionized deep learning by addressing internal covariate shift, which can slow down learning, limit learning rates, and make deep networks difficult to train.

By normalizing inputs within each
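The normalization step being described fits in a few lines. A minimal numpy sketch of the training-time forward pass (variable names are mine; the real layer also tracks running statistics for inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature across the batch, then scale and shift.

    x: (batch, features); gamma, beta: learnable per-feature parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 10))   # shifted, scaled inputs
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(6), y.std(axis=0).round(2))  # ≈0 and ≈1 per feature
```

Because gamma and beta are learnable, the network can undo the normalization if that helps, so expressivity is not lost.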
Lucas Torroba-Hennigen (@ltorroba1) 's Twitter Profile Photo

Previous work has established that training a linear layer with GaLore is the same as training it with a half-frozen LoRA adapter. But how far can we push this equivalence?

Read our paper, or come to our poster session at #ICML2025 on Wednesday at 4:30pm, to find out!

📄:
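The established equivalence is easy to verify numerically. The construction below is my own sketch under the usual definitions (GaLore projects the weight gradient onto a rank-r subspace P; "half-frozen" LoRA trains B in W0 + P @ B with P frozen), not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, lr = 6, 2, 0.1
W0 = rng.normal(size=(d, d))                   # linear layer weight
G = rng.normal(size=(d, d))                    # gradient dL/dW at W0
P, _ = np.linalg.qr(rng.normal(size=(d, r)))   # orthonormal rank-r basis

# GaLore-style step: project the gradient into the subspace before updating.
W_galore = W0 - lr * P @ (P.T @ G)

# Half-frozen LoRA step: by the chain rule through frozen P, dL/dB = P.T @ dL/dW.
B = np.zeros((r, d))
B -= lr * (P.T @ G)
W_lora = W0 + P @ B

print(np.allclose(W_galore, W_lora))   # True: the two updates coincide
```

The algebra is one line: W0 + P(−lr·PᵀG) = W0 − lr·PPᵀG, i.e. both methods apply the same projected-gradient update to the effective weight.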
Mandeep Rathee (@rathee_mandeep) 's Twitter Profile Photo

I will present our paper “Breaking the lens of the Telescope: Online Relevance Estimation over Large Retrieval Sets” at #SIGIR2025
🕰️ 10:30 AM (16.07.2025)
📍 Location: GIOTTO (Floor 0)
Full Paper: dl.acm.org/doi/10.1145/37…
Slides: sigir2025.dei.unipd.it/detailed-progr…
Bodhisattwa Majumder (@mbodhisattwa) 's Twitter Profile Photo

Excited to share what I have been focusing on this year!

Inference-time search to optimize Bayesian surprise pushes us towards long-horizon discovery! Introducing "AutoDS": Autonomous Discovery via Surprisal.

"It can not only find the diamond in the rough, but also can rule out
Mandeep Rathee (@rathee_mandeep) 's Twitter Profile Photo

🎉 Just wrapped up an incredible experience at #SIGIR2025 in beautiful Padova, Italy!
Had the privilege of presenting my research paper and connecting with brilliant minds from the IR community. Big thanks to amazing collaborators venktesh, Sean MacAvaney, and Avishek Anand
Chen-Yu Lee (@chl260) 's Twitter Profile Photo

Thrilled to introduce "𝗗𝗲𝗲𝗽 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿 𝘄𝗶𝘁𝗵 𝗧𝗲𝘀𝘁-𝗧𝗶𝗺𝗲 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀

arxiv.org/pdf/2507.16075
Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Nate Chen (@chengua46724992) 's Twitter Profile Photo

Why do FFNs use ReLU instead of more precise kernels like Exp?

"We propose the following hypothesis: A kernel with lower retrieval precision encourages a more polysemantic key–value memory: multiple unrelated facts can be stored under the same key space"

Great and inspiring read!
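The hypothesis can be made concrete with a toy key-value memory (my illustration, not the paper's experiment): retrieval weights are phi(q · k_i), and a flat kernel like ReLU spreads weight over more keys than a sharp kernel like exp.

```python
import numpy as np

# Toy similarity scores q · k_i for 5 stored keys; the query matches key 0 best.
scores = np.array([3.0, 1.0, 1.0, 1.0, 1.0])

relu_w = np.maximum(scores, 0.0)       # ReLU kernel
exp_w = np.exp(scores)                 # exp (softmax-style) kernel

relu_frac = relu_w[0] / relu_w.sum()   # weight on the best key under ReLU
exp_frac = exp_w[0] / exp_w.sum()      # weight on the best key under exp

# ReLU retrieves less precisely: weight leaks to the other keys, so one key
# region can serve several facts — the "polysemantic" memory in the quote.
print(round(relu_frac, 3), round(exp_frac, 3))   # 0.429 vs 0.649
```

In this toy, exp concentrates about 65% of the retrieval weight on the best key while ReLU gives it only about 43%, which is the precision gap the hypothesis turns into a feature.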
Zihan Wang - on RAGEN (@wzihanw) 's Twitter Profile Photo

To folks diving into fine-tuning open-source MoEs today: check out ESFT, our customized PEFT method for MoE models. Train with 90% fewer parameters, gain 95%+ task performance, and keep 98% general performance :)
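A rough sketch of the idea as I understand it (expert-specialized fine-tuning: train only the experts most activated by the target task, freeze the rest; the affinity numbers below are made up for illustration):

```python
import numpy as np

n_experts = 8
# Hypothetical average gating weight each expert receives on the target task.
task_affinity = np.array([0.30, 0.02, 0.01, 0.25, 0.01, 0.02, 0.01, 0.38])

# Keep only the most task-relevant experts trainable; freeze the rest.
top_k = 2
trainable = np.argsort(task_affinity)[-top_k:]
frozen = [i for i in range(n_experts) if i not in trainable]

frac_trainable = top_k / n_experts
print(sorted(trainable.tolist()), frac_trainable)   # [0, 7] 0.25
```

Because MoE experts hold most of the parameters and each task concentrates its gating on a few of them, restricting updates to that subset is what yields the large parameter savings with little loss in task or general performance.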