venktesh (On faculty job market) (@vvenki22)'s Twitter Profile
venktesh (On faculty job market)

@vvenki22

Postdoc researcher @tudelft WIS, former PhD candidate @iiitdelhi. PM fellow 2021 (for doctoral research). Scholar: tinyurl.com/mrk5p5j3

ID: 838435305564827649

Website: https://venkteshv.github.io
Joined: 05-03-2017 17:05:29

5.5K Tweets

695 Followers

4.4K Following

James Zou (@james_y_zou):

Introducing Fractional Reasoning: a mechanistic method to quantitatively control how much thinking an LLM performs.

tl;dr: we identify latent reasoning knobs in transformer embeddings ➡️ a better inference-compute approach that mitigates under/over-thinking. arxiv.org/pdf/2506.15882
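The thread doesn't include code, but the "latent reasoning knob" idea lends itself to a small sketch. The snippet below is a hypothetical illustration of steering-vector scaling on hidden states, in the spirit of the abstract; the paper's actual procedure, layer choice, and model attribute paths may differ.

```python
import torch

# Hypothetical sketch of a "latent reasoning knob" (not the paper's actual
# implementation): derive a reasoning direction from hidden states and add
# a scaled copy of it during inference.

def reasoning_direction(h_think, h_plain):
    """Difference-of-means direction between hidden states collected under
    'think step by step' prompts and plain prompts (both [n, d])."""
    v = h_think.mean(dim=0) - h_plain.mean(dim=0)
    return v / v.norm()

def make_hook(v, alpha):
    """Forward hook shifting a layer's hidden states by alpha * v.
    alpha is the 'fractional' knob: <1 damps reasoning, >1 amplifies it."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * v.to(hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

# Usage (names are placeholders): pick a mid-depth transformer block and
# register the hook before generation.
# layer = model.model.layers[16]
# handle = layer.register_forward_hook(make_hook(v, alpha=0.5))
# ... model.generate(...) ...
# handle.remove()
```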
Peng Qi (@qi2peng2):

Seven years ago, I co-led a paper called HotpotQA that has motivated and facilitated many #AI #Agents research works since. Today, I'm asking that you stop using HotpotQA blindly for agents research in 2025 and beyond. In my new blog post, I revisit the brief history of…

Alexandre Défossez (@honualx):

We just released the TTS model that powers Unmute 🗣️ It offers low latency, high fidelity, and the fewest pronunciation errors compared to a wide range of commercial and open-source models 🎯. Preprint coming soon 📑. See the project page below to test and learn more 👇

Sebastian Raschka (@rasbt):

If you're getting into LLMs, PyTorch is essential. And a lot of folks asked for beginner-friendly material, so I put this together: PyTorch in One Hour: From Tensors to Multi-GPU Training (sebastianraschka.com/teaching/pytor…) 📖 ~1h to read through 💡 Maybe the perfect weekend project!? I've…
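The tutorial itself isn't quoted here, but a minimal example of the ground it covers (tensors, a model, autograd, a training loop) looks like the sketch below; the multi-GPU part would wrap the same loop in DistributedDataParallel. The toy data is mine, not the tutorial's.

```python
import torch
import torch.nn as nn

# Toy tensors: random inputs and a binary label derived from them.
X = torch.randn(128, 4)
y = (X.sum(dim=1) > 0).long()

model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()          # autograd computes gradients
    opt.step()               # optimizer updates parameters
```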

Yucen Lily Li (@yucenlily):

In our new ICML paper, we show that popular families of OOD detection procedures, such as feature- and logit-based methods, are fundamentally misspecified, answering a different question than "is this point from a different distribution?" arxiv.org/abs/2507.01831 [1/7]
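For readers unfamiliar with the families being critiqued, here is a sketch of two standard logit-based detectors (maximum softmax probability and the energy score). These illustrate the class of methods the paper examines, not the paper's own proposal; the threshold is a placeholder.

```python
import torch
import torch.nn.functional as F

def max_softmax_score(logits):
    """MSP baseline: higher score = more in-distribution."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

def energy_score(logits, T=1.0):
    """Energy-based score: higher score = more in-distribution."""
    return T * torch.logsumexp(logits / T, dim=-1)

logits = torch.randn(8, 10)                # stand-in for classifier outputs
is_ood = max_softmax_score(logits) < 0.5   # threshold tuned on held-out data
```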

Dmitry Krotov (@dimakrotov):

Lagrangians are often used in physics for deriving the energy of mechanical systems. But are they useful for neural networks and AI?

It turns out they are extremely helpful for working with energy-based models and energy-based Associative Memories. You need to specify a…
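The tweet cuts off, but the recipe it gestures at (per Krotov and Hopfield's energy-based associative memory papers) is: choose a Lagrangian per layer, take activations as its gradient, and obtain the energy via a Legendre transform plus an interaction term. A minimal sketch, under my reading of that recipe:

```python
import torch

def lagrangian(x, beta=1.0):
    """L(x) = (1/beta) * logsumexp(beta * x); its gradient is softmax."""
    return torch.logsumexp(beta * x, dim=-1) / beta

def energy(x, W, beta=1.0):
    g = torch.softmax(beta * x, dim=-1)            # g = dL/dx (activations)
    legendre = (x * g).sum(-1) - lagrangian(x, beta)
    interaction = -0.5 * torch.einsum('...i,ij,...j->...', g, W, g)
    return legendre + interaction

x = torch.randn(5)
W = torch.randn(5, 5)
W = 0.5 * (W + W.T)                                # symmetric couplings
print(energy(x, W))
```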
Kiran Purohit (on job market) (@kiranpurohit08):

If you are attending ICML Conference, catch our poster tomorrow (16th July) at the Poster session (4:30 p.m. PDT – 7 p.m. PDT)!

🎥 Video: recorder-v3.slideslive.com/#/share?share=…
📑 Paper: openreview.net/pdf?id=cuqvlLB…
💻 Code: github.com/kiranpurohit/C…

venktesh, Sourangshu Bhattacharya, Avishek Anand
Richard Suwandi @ICLR2025 (@richardcsuwandi):

BatchNorm wins the Test-of-Time Award at #ICML2025! 🎉

BatchNorm revolutionized deep learning by addressing internal covariate shift, which can slow down learning, limit learning rates, and make it difficult to train deep networks.

By normalizing inputs within each…
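For reference, the computation BatchNorm performs in training mode is small enough to write out; this sketch mirrors nn.BatchNorm1d without the running-statistics bookkeeping used at inference time.

```python
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(dim=0)                  # per-feature batch mean
    var = x.var(dim=0, unbiased=False)    # per-feature batch variance
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta           # learnable rescale and shift

x = torch.randn(32, 64)                   # batch of 32, 64 features
gamma, beta = torch.ones(64), torch.zeros(64)
out = batch_norm(x, gamma, beta)           # matches nn.BatchNorm1d training mode
```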
Lucas Torroba-Hennigen (@ltorroba1):

Previous work has established that training a linear layer with GaLore is the same as training it with a half-frozen LoRA adapter. But how far can we push this equivalence?

Read our paper, or come to our poster session at #ICML2025 on Wednesday at 4:30pm, to find out!

📄: …
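A toy version of the equivalence being referenced, as I understand it (my illustration, not the paper's code): freezing one factor of a LoRA adapter to a projection matrix P makes the trainable factor's gradient live in P's column space, which is exactly GaLore's projected update.

```python
import torch
import torch.nn as nn

d, k, r = 64, 32, 4
W0 = torch.randn(d, k)                    # frozen pretrained weight
P = torch.linalg.qr(torch.randn(d, r)).Q  # frozen projection (GaLore's P)

A = nn.Parameter(torch.zeros(r, k))       # the only trained matrix
x = torch.randn(8, k)

# Forward with the half-frozen adapter: y = x @ (W0 + P A)^T
y = x @ (W0 + P @ A).T
loss = y.pow(2).mean()
loss.backward()

# A's gradient equals P^T times the gradient of the full weight, i.e. the
# update lives in P's column space, matching GaLore's projected update.
print(A.grad.shape)  # (r, k)
```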
Mandeep Rathee (@rathee_mandeep):

I will present our paper "Breaking the Lens of the Telescope: Online Relevance Estimation over Large Retrieval Sets" at #SIGIR2025
🕰️ 10:30 AM (16.07.2025)
📍 Location: GIOTTO (Floor 0)
Full Paper: dl.acm.org/doi/10.1145/37…
Slides: sigir2025.dei.unipd.it/detailed-progr…
Bodhisattwa Majumder (@mbodhisattwa):

Excited to share what I have been focusing on this year!

Inference-time search to optimize Bayesian surprise pushes us towards long-horizon discovery! Introducing "AutoDS": Autonomous Discovery via Surprisal.

"It can not only find the diamond in the rough, but also can rule out…
Mandeep Rathee (@rathee_mandeep):

🎉 Just wrapped up an incredible experience at #SIGIR2025 in beautiful Padova, Italy!
Had the privilege of presenting my research paper and connecting with brilliant minds from the IR community. Big thanks to amazing collaborators venktesh, Sean MacAvaney, and Avishek Anand
Chen-Yu Lee (@chl260):

Thrilled to introduce "Deep Researcher with Test-Time Diffusion," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀

arxiv.org/pdf/2507.16075
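The paper isn't excerpted here, but the plan-draft-revise cycle it describes can be caricatured in a few lines. Everything below is hypothetical (the llm and search callables are stand-ins), intended only to show the shape of a test-time revision loop:

```python
def deep_research(question, llm, search, steps=3):
    """Toy plan/draft/revise loop; revision plays the role of denoising."""
    plan = llm(f"Draft a research plan for: {question}")
    draft = llm(f"Write an initial report following this plan:\n{plan}")
    for _ in range(steps):
        query = llm(f"What should we look up to improve this draft?\n{draft}")
        evidence = search(query)
        draft = llm(f"Revise the report using new evidence:\n{evidence}\n"
                    f"Current draft:\n{draft}")
    return draft
```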
Niloofar (on faculty job market!) (@niloofar_mire):

🧵 Academic job market season is almost here! There's so much that's rarely discussed: nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Nate Chen (@chengua46724992):

Why do FFNs use ReLU instead of more precise kernels like Exp?

"We propose the following hypothesis: A kernel with lower retrieval precision encourages a more polysemantic key–value memory: multiple unrelated facts can be stored under the same key space."

Great and inspiring read!
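The quoted hypothesis builds on the view of an FFN as a key-value memory with a retrieval kernel. A sketch of that view (my illustration, not the paper's code), contrasting ReLU's blunt retrieval with exp's sharp, near one-hot retrieval:

```python
import torch

# FFN-as-memory view: out = sum_i phi(x . k_i) * v_i, where phi is the kernel.
d, m = 16, 64                  # model dim, number of memory slots
K = torch.randn(m, d)          # keys   (FFN first-layer weights)
V = torch.randn(m, d)          # values (FFN second-layer weights)
x = torch.randn(d)

scores = K @ x                                  # key-query similarities
relu_out = torch.relu(scores) @ V               # ReLU: many slots fire weakly
exp_out = torch.softmax(scores, dim=0) @ V      # exp: sharp, near one-hot
```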
Zihan Wang - on RAGEN (@wzihanw):

To folks diving into fine-tuning open-source MoEs today: check out ESFT, our customized PEFT method for MoE models. Train with 90% fewer parameters, gain 95%+ task performance, and keep 98% general performance :)
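ESFT's repo isn't quoted here, but the idea as announced (fine-tune only the experts a task actually routes to) can be sketched as follows; the selection heuristic and parameter-name matching below are my assumptions, not the official implementation.

```python
import torch

def select_experts(router_logits, top_fraction=0.1):
    """router_logits: [tokens, n_experts], gathered on task data.
    Returns ids of the experts the task routes to most often."""
    counts = router_logits.argmax(dim=-1).bincount(
        minlength=router_logits.shape[-1])
    k = max(1, int(top_fraction * router_logits.shape[-1]))
    return counts.topk(k).indices

def freeze_except(model, expert_ids, expert_param_name="experts"):
    """Unfreeze only the selected experts (name pattern is hypothetical)."""
    for name, p in model.named_parameters():
        p.requires_grad = any(
            f"{expert_param_name}.{i}." in name for i in expert_ids)
```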