Sushant Sachdeva (@sushnt)'s Twitter Profile
Sushant Sachdeva

@sushnt

Faculty@UToronto CS, Google research, Vector Institute affiliate.
Algorithms, Optimization, Learning.
@sushnt.bsky.social

ID: 20110106

Link: https://www.cs.toronto.edu/~sachdeva/
Joined: 05-02-2009 00:37:09

414 Tweets

1.1K Followers

893 Following

Ankit Singla (@stub_as)

My new book for kids 4-8 is here! The book teaches Hindi vowel diacritics ("मात्रा") using a story of magicians, each with the power of one मात्रा, by changing which they change not just words, but also real-world things, e.g., transform a hand (हाथ) to an elephant (हाथी).

Varun Bhalerao (@starlabiitb)

STAR Lab completes 8 years today! Within the last year, we have:
👩🏼‍🎓 one new PhD
📄 15 refereed papers
📑 > 100 GCNs, ATELs, etc
Lifetime stats:
🎇 > 700 GCNs, ATELs, etc
☄️ > 200 MPECs (asteroid observations)
📚 > 10,000 citations
📄 > 110 refereed papers
Po-Shen Loh (@poshenloh)

I've come up with a way of demystifying e (2.718) for high schoolers. poshenloh.com/e

I think it is much more intuitive than how e is currently taught, and could update all of the textbooks. Full article: arxiv.org/abs/2504.10664

In the USA, the first time people learn
François Fleuret (@francoisfleuret)

As expected, that was popular. Here is my attempt at consolidating all the answers into a list.
- Prenorm: normalization in the residual blocks before the attention operation and the FFN respectively
- GQA (Group Query Attention): more Q than (K, V)

Sushant Sachdeva (@sushnt)

Few people care like Jelani Nelson to fight the good fight. We need to prepare the next generation by expecting them to learn mathematics well, not just get rubber stamped grades via watered-down courses, as this bill will nudge us towards.

Dan Roy (@roydanroy)

Would love help identifying amazing ML researchers with strong connections to Canada who are currently outside Canada (thus potentially targets for recruitment as US situation deteriorates). DMs please. Retweet please.

Sushant Sachdeva (@sushnt)

Chat, I am sick of the STT (speech-to-text) default on The Gmail™ Keyboard on my Android phone. Given how good Whisper is, there must be better options? What's the best approach for typing via voice on Android phones? Local models/apps preferred. What about for a Mac? RT for reach pls