Leo Du (@leoduw)'s Twitter Profile
Leo Du

@leoduw

Positive semi-nondeterministic PhD student @jhuclsp currently visiting #rycolab ETH | previously @uwcse | math junkie | profile pic reads "the 8th Busy Beaver"

ID: 827808071778918400

Joined: 04-02-2017 09:16:39

190 Tweets

182 Followers

210 Following

Jason Eisner (@adveisner)'s Twitter Profile Photo

Q: Does my LM leak probability onto infinite strings? A: For RNNs and PFSAs you need to test, but Transformers always generate EOS in finite time (prob=1). 🤔First we need to formalize the question… cs.jhu.edu/~jason/papers/… #ACL2023 poster Tue 11am w/Leo Du @ryandcotterell et al

Leo Du (@leoduw)'s Twitter Profile Photo

Following up a weekend effort by another weekend effort: llama2.rs 🦀 github.com/leo-du/llama2.… In a single Rust file w/ * zero dependencies (i.e. custom rng w/ PCG) * zero lines of `unsafe` code (very 🦀!) * support user prompts * (almost) same performance

Dan Roy (@roydanroy)'s Twitter Profile Photo

One of the fundamental problems with probability notation in machine learning is due to the fact that few people really have a firm grasp on conditioning from a measure theoretical perspective. Another issue: random variables versus indexed collections of probability spaces.

Distributed AI Research Institute is on Mastodon (@dairinstitute)'s Twitter Profile Photo

Congratulations to Rylan Schaeffer, Brando Miranda, Sanmi Koyejo for winning a best paper award at NeurIPS for this insightful paper. Are Emergent Abilities of Large Language Models a Mirage? arxiv.org/abs/2304.15004

Afra Amini (@afra_amini)'s Twitter Profile Photo

If you are interested in knowing how you can do energy-based sampling from language models, make sure to check our #NeurIPS23 paper titled “Structured Voronoi Sampling”...🧵 arxiv.org/pdf/2306.03061…

Sanmi Koyejo (@sanmikoyejo)'s Twitter Profile Photo

"Are Emergent Abilities of Large Language Models a Mirage?" is a NeurIPS outstanding paper!🙌🏿 Congrats especially to the students Rylan Schaeffer Brando Miranda & other awardees. If you want to learn more, check out the oral & poster 👇🏿this afternoon (Dec 14) 1/2

Will Crichton (@tonofcrates)'s Twitter Profile Photo

New paper out w/ Shriram Krishnamurthi (primary: Bluesky) accepted to OOPSLA'24: a psychometric analysis of programming language learning. We added ~200 quiz questions to a popular book on Rust and collected ~1,000,000 answers from ~60,000 people over 1 year. arxiv.org/abs/2401.01257

Justin T Chiu (@justintchiu)'s Twitter Profile Photo

wrote a short note on using parallel scans for backprop: justintchiu.com/blog/pscan_dif… turns out there was already a paper on this too! arxiv.org/abs/1907.10134

Emily Riehl (@emilyriehl)'s Twitter Profile Photo

Dominic Verity, Mario Carneiro and I just announced a new project to formalize some aspects of ∞-category theory in #Lean via the notion of an ∞-cosmos.

Leo Du (@leoduw)'s Twitter Profile Photo

Love this distinction between “compiler optimization” and “programming model”. Coming back to a common example, part of Rust’s success is its programming model. Also, great talk!