venktesh (On faculty job market) (@vvenki22) 's Twitter Profile
venktesh (On faculty job market)

@vvenki22

Postdoc researcher @tudelft wis, former PhD candidate @iiitdelhi. PM Fellow 2021 (for doctoral research). Scholar: tinyurl.com/mrk5p5j3

ID: 838435305564827649

Link: https://venkteshv.github.io · Joined: 05-03-2017 17:05:29

5.5K Tweets

695 Followers

4.4K Following

James Zou (@james_y_zou) 's Twitter Profile Photo

Introducing Fractional Reasoning: a mechanistic method to quantitatively control how much thinking an LLM performs.

tldr: we identify latent reasoning knobs in transformer embeddings ➡️ a better inference-compute approach that mitigates under/over-thinking. arxiv.org/pdf/2506.15882
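One way to picture a "latent reasoning knob" is hidden-state steering: shift activations along a learned direction, with a scalar controlling the strength. This is a minimal sketch under that assumption; the direction, scaling scheme, and names are mine, not the paper's actual method:

```python
import numpy as np

def apply_reasoning_knob(hidden, direction, alpha):
    """Shift hidden states along a unit 'reasoning' direction, scaled by alpha.

    hidden: (tokens, dim) activations; direction: (dim,) hypothetical knob vector.
    alpha quantitatively controls how strongly the model is pushed to think.
    """
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))          # 4 tokens, hidden dim 8
knob = rng.normal(size=8)
steered = apply_reasoning_knob(h, knob, alpha=0.5)
print(steered.shape)                 # each token moved by exactly alpha
```

The appeal of a scalar alpha is that it makes "amount of thinking" a continuous, tunable quantity at inference time rather than a binary prompt choice.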
Peng Qi (@qi2peng2) 's Twitter Profile Photo

Seven years ago, I co-led a paper called 𝗛𝗼𝘁𝗽𝗼𝘁𝗤𝗔 that has motivated and facilitated many #AI #Agents research works since. Today, I'm asking that you stop using HotpotQA blindly for agents research in 2025 and beyond. In my new blog post, I revisit the brief history of

Alexandre Défossez (@honualx) 's Twitter Profile Photo

We just released the TTS model that powers Unmute 🗣️ It offers low latency, high fidelity, and the fewest pronunciation errors compared to a wide range of commercial and open-source models 🎯. Preprint is coming soon 📑. See the project page below to test and learn more 👇

Sebastian Raschka (@rasbt) 's Twitter Profile Photo

If you're getting into LLMs, PyTorch is essential. And a lot of folks asked for beginner-friendly material, so I put this together: PyTorch in One Hour: From Tensors to Multi-GPU Training (sebastianraschka.com/teaching/pytor…) 📖 ~1h to read through 💡 Maybe the perfect weekend project!? I’ve

Tianyu Zheng (@zhengtianyu4) 's Twitter Profile Photo

🚀 Thrilled to announce our new work: FR3E (First Return, Entropy-Eliciting Explore)!

LLM reasoning with Reinforcement Learning often struggles with unstable and inefficient exploration. We propose FR3E, a structured framework to make it more robust & efficient.
Yucen Lily Li (@yucenlily) 's Twitter Profile Photo

In our new ICML paper, we show that popular families of OOD detection procedures, such as feature and logit based methods, are fundamentally misspecified, answering a different question than “is this point from a different distribution?” arxiv.org/abs/2507.01831 [1/7]

Dmitry Krotov (@dimakrotov) 's Twitter Profile Photo

Lagrangians are often used in physics for deriving the energy of mechanical systems. But are they useful for neural networks and AI?

It turns out they are extremely helpful for working with energy-based models and energy-based Associative Memories. You need to specify a
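A concrete instance of the Lagrangian-to-energy recipe used in energy-based associative memories (a generic sketch, not necessarily the thread's exact formulation): the activation function is the gradient of a Lagrangian, and the energy comes from its Legendre transform.

```python
import numpy as np

def lagrangian(x):
    """Log-sum-exp Lagrangian; its gradient is the softmax activation."""
    m = x.max()
    return m + np.log(np.sum(np.exp(x - m)))

def activation(x):
    """g = dL/dx: the gradient of log-sum-exp is exactly softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([1.0, 2.0, 0.5])
g = activation(x)
energy = x @ g - lagrangian(x)   # Legendre transform: E = x·g − L(x)
print(g.sum())                   # softmax activations sum to 1
```

Choosing a different Lagrangian (e.g. a sum of element-wise convex functions) yields a different activation and a different energy, which is what makes the construction so flexible.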
Kiran Purohit (on job market) (@kiranpurohit08) 's Twitter Profile Photo

If you are attending ICML Conference, catch our poster tomorrow (16th July) at the Poster session (4:30 p.m. PDT — 7 p.m. PDT)!

🎥 Video: recorder-v3.slideslive.com/#/share?share=…
📑 Paper: openreview.net/pdf?id=cuqvlLB…
💻 Code: github.com/kiranpurohit/C…

venktesh Sourangshu Bhattacharya Avishek Anand
Richard Suwandi @ICLR2025 (@richardcsuwandi) 's Twitter Profile Photo

BatchNorm wins the Test-of-Time Award at #ICML2025! 🎉

BatchNorm revolutionized deep learning by addressing internal covariate shift, which can slow down learning, limit learning rates, and make deep networks difficult to train.

By normalizing inputs within each
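The normalization step being described fits in a few lines. A minimal numpy sketch of the training-time forward pass (variable names are mine; the real layer also tracks running statistics for inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature across the batch, then scale and shift.

    x: (batch, features); gamma, beta: learnable per-feature parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 10))   # shifted, scaled inputs
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(6), y.std(axis=0).round(2))  # ≈0 and ≈1 per feature
```

Because gamma and beta are learnable, the network can undo the normalization if that helps, so expressivity is not lost.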
Lucas Torroba-Hennigen (@ltorroba1) 's Twitter Profile Photo

Previous work has established that training a linear layer with GaLore is the same as training it with a half-frozen LoRA adapter. But how far can we push this equivalence?

Read our paper, or come to our poster session at #ICML2025 on Wednesday at 4:30pm, to find out!

📄:
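The established equivalence is easy to verify numerically. The construction below is my own sketch under the usual definitions (GaLore projects the weight gradient onto a rank-r subspace P; "half-frozen" LoRA trains B in W0 + P @ B with P frozen), not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, lr = 6, 2, 0.1
W0 = rng.normal(size=(d, d))                   # linear layer weight
G = rng.normal(size=(d, d))                    # gradient dL/dW at W0
P, _ = np.linalg.qr(rng.normal(size=(d, r)))   # orthonormal rank-r basis

# GaLore-style step: project the gradient into the subspace before updating.
W_galore = W0 - lr * P @ (P.T @ G)

# Half-frozen LoRA step: by the chain rule through frozen P, dL/dB = P.T @ dL/dW.
B = np.zeros((r, d))
B -= lr * (P.T @ G)
W_lora = W0 + P @ B

print(np.allclose(W_galore, W_lora))   # True: the two updates coincide
```

The algebra is one line: W0 + P(−lr·PᵀG) = W0 − lr·PPᵀG, i.e. both methods apply the same projected-gradient update to the effective weight.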
Mandeep Rathee (@rathee_mandeep) 's Twitter Profile Photo

I will present our paper “Breaking the lens of the Telescope: Online Relevance Estimation over Large Retrieval Sets” at #SIGIR2025
🕰️ 10:30 AM (16.07.2025)
📍 Location: GIOTTO (Floor 0)
Full Paper: dl.acm.org/doi/10.1145/37…
Slides: sigir2025.dei.unipd.it/detailed-progr…
Bodhisattwa Majumder (@mbodhisattwa) 's Twitter Profile Photo

Excited to share what I have been focusing on this year!

Inference-time search to optimize Bayesian surprise pushes us towards long-horizon discovery! Introducing "AutoDS": Autonomous Discovery via Surprisal.

"It can not only find the diamond in the rough, but also can rule out
Mandeep Rathee (@rathee_mandeep) 's Twitter Profile Photo

🎉 Just wrapped up an incredible experience at #SIGIR2025 in beautiful Padova, Italy!
Had the privilege of presenting my research paper and connecting with brilliant minds from the IR community. Big thanks to amazing collaborators venktesh, Sean MacAvaney, and Avishek Anand
Chen-Yu Lee (@chl260) 's Twitter Profile Photo

Thrilled to introduce "𝗗𝗲𝗲𝗽 𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵𝗲𝗿 𝘄𝗶𝘁𝗵 𝗧𝗲𝘀𝘁-𝗧𝗶𝗺𝗲 𝗗𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀

arxiv.org/pdf/2507.16075
Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Nate Chen (@chengua46724992) 's Twitter Profile Photo

Why do FFNs use ReLU instead of more precise kernels like Exp?

"We propose the following hypothesis: A kernel with lower retrieval precision encourages a more polysemantic key–value memory: multiple unrelated facts can be stored under the same key space"

Great and inspiring read!
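The hypothesis can be made concrete with a toy key-value memory (my illustration, not the paper's experiment): retrieval weights are phi(q · k_i), and a flat kernel like ReLU spreads weight over more keys than a sharp kernel like exp.

```python
import numpy as np

# Toy similarity scores q · k_i for 5 stored keys; the query matches key 0 best.
scores = np.array([3.0, 1.0, 1.0, 1.0, 1.0])

relu_w = np.maximum(scores, 0.0)       # ReLU kernel
exp_w = np.exp(scores)                 # exp (softmax-style) kernel

relu_frac = relu_w[0] / relu_w.sum()   # weight on the best key under ReLU
exp_frac = exp_w[0] / exp_w.sum()      # weight on the best key under exp

# ReLU retrieves less precisely: weight leaks to the other keys, so one key
# region can serve several facts — the "polysemantic" memory in the quote.
print(round(relu_frac, 3), round(exp_frac, 3))   # 0.429 vs 0.649
```

In this toy, exp concentrates about 65% of the retrieval weight on the best key while ReLU gives it only about 43%, which is the precision gap the hypothesis turns into a feature.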
Zihan Wang - on RAGEN (@wzihanw) 's Twitter Profile Photo

To folks diving into fine-tuning open-source MoEs today: check out ESFT, our customized PEFT method for MoE models. Train with 90% fewer parameters, gain 95%+ task performance, and keep 98% general performance :)
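A rough sketch of the idea as I understand it (expert-specialized fine-tuning: train only the experts most activated by the target task, freeze the rest; the affinity numbers below are made up for illustration):

```python
import numpy as np

n_experts = 8
# Hypothetical average gating weight each expert receives on the target task.
task_affinity = np.array([0.30, 0.02, 0.01, 0.25, 0.01, 0.02, 0.01, 0.38])

# Keep only the most task-relevant experts trainable; freeze the rest.
top_k = 2
trainable = np.argsort(task_affinity)[-top_k:]
frozen = [i for i in range(n_experts) if i not in trainable]

frac_trainable = top_k / n_experts
print(sorted(trainable.tolist()), frac_trainable)   # [0, 7] 0.25
```

Because MoE experts hold most of the parameters and each task concentrates its gating on a few of them, restricting updates to that subset is what yields the large parameter savings with little loss in task or general performance.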