Asher Trockman (@ashertrockman) 's Twitter Profile
Asher Trockman

@ashertrockman

CS PhD student at Carnegie Mellon University

ID: 4098537032

Link: http://ashertrockman.com | Joined: 02-11-2015 05:32:00

185 Tweets

657 Followers

216 Following

Jeff Dean (@jeffdean) 's Twitter Profile Photo

We've had an account for Google Research for a while, but we're going to start posting more info about the work done by Google Research here. Follow for awesome research content!

Gabriel Bianconi (@gabrielbianconi) 's Twitter Profile Photo

SuperDial was the first company to deploy TensorZero in production. Really exciting to see the progress and impact they've made over the past year+! Congratulations to Sam Schwager, Harrison Caruthers, and the SuperDial team — well deserved! 🍾

Delip Rao e/σ (@deliprao) 's Twitter Profile Photo

Gemini 2.5 Pro is the most underrated model. Extraordinary intelligence for free. I am not sure why people don’t talk about this all the time.

Shane Gu (@shaneguml) 's Twitter Profile Photo

NeurIPS workshop proposal rejected. We had AMAZING speakers, and a great list of organizers. No rebuttal phase. No feedback. No dense reward :( Since I work full-time on Gemini and have zero time to publish, this was my chance to contribute to academia 😞

Simo Ryu (@cloneofsimo) 's Twitter Profile Photo

It takes LITERALLY 2 min to set up gemini-cli and I'm telling you it's incredible. No need to set up a credit card or anything. It's also super intuitive and safe. You should definitely try it out. 100% recommend.

```
npm install -g @google/gemini-cli
gemini
```
Albert Gu (@_albertgu) 's Twitter Profile Photo

I converted one of my favorite talks I've given over the past year into a blog post.

"On the Tradeoffs of SSMs and Transformers"
(or: tokens are bullshit)

In a few days, we'll release what I believe is the next major advance for architectures.
Aya Somai (@aya_somai_) 's Twitter Profile Photo

My favorite reading of the week, by Yiding Jiang: the next era is not about learning from data but about deciding what data to learn from. yidingjiang.github.io/blog/post/expl…

Simone Scardapane (@s_scardapane) 's Twitter Profile Photo

*Antidistillation Sampling* by Yash Savani Asher Trockman Zico Kolter et al. They modify the logits of a model with a penalty term that poisons potential distillation attempts (by estimating the downstream distillation loss). arxiv.org/abs/2504.13146

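The logit-perturbation idea in the tweet can be sketched in a few lines. This is a toy illustration only, not the paper's actual estimator: `distill_grad` stands in for a hypothetical per-token proxy of how much sampling each token would reduce a student's downstream distillation loss, which the real method estimates rather than assumes.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D logit vector
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def antidistillation_logits(logits, distill_grad, lam=1.0):
    """Toy sketch: subtract a penalty proportional to a (hypothetical)
    estimate of the downstream distillation loss reduction per token,
    so tokens most useful to a would-be student are sampled less often."""
    return logits - lam * distill_grad

# Tokens that would help a student most (high distill_grad) get down-weighted.
logits = np.array([2.0, 1.0, 0.0])
distill_grad = np.array([5.0, 0.0, 0.0])  # token 0 is most useful to distill
p_plain = softmax(logits)
p_poisoned = softmax(antidistillation_logits(logits, distill_grad))
```

With the penalty applied, the probability mass shifts away from token 0 while the output remains a valid distribution, so the teacher still produces fluent samples.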
Sukjun (June) Hwang (@sukjun_hwang) 's Twitter Profile Photo

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data

Albert Gu (@_albertgu) 's Twitter Profile Photo

Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.

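The chunking idea described above can be illustrated with a toy sketch. This is not the actual H-Net: in the real architecture a learned module predicts chunk boundaries end-to-end, whereas here `boundary_probs` is simply given, and each chunk is mean-pooled into one higher-level vector.

```python
import numpy as np

def dynamic_chunk(x, boundary_probs, threshold=0.5):
    """Toy dynamic chunking: cut the sequence of vectors `x` wherever
    the predicted boundary probability crosses `threshold`, then
    mean-pool each chunk into a single higher-level representation."""
    chunks, start = [], 0
    for i, p in enumerate(boundary_probs):
        if p >= threshold:
            chunks.append(x[start:i + 1].mean(axis=0))
            start = i + 1
    if start < len(x):  # trailing chunk with no closing boundary
        chunks.append(x[start:].mean(axis=0))
    return np.stack(chunks)

# 6 byte-level embeddings get compressed into 3 chunk embeddings.
x = np.arange(12.0).reshape(6, 2)
probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.3]
out = dynamic_chunk(x, probs)
```

Because the boundaries depend on the data rather than a fixed vocabulary, the same mechanism can discover units at whatever granularity the sequence calls for.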
Nicholas Roberts (@nick11roberts) 's Twitter Profile Photo

🎉 Excited to share that our paper "Pretrained Hybrids with MAD Skills" was accepted to Conference on Language Modeling 2025! We introduce Manticore - a framework for automatically creating hybrid LMs from pretrained models without training from scratch. 🧵[1/n]

Dylan Foster 🐢 (@canondetortugas) 's Twitter Profile Photo

For those at ICML, Audrey will be presenting this paper at the 4:30 poster session this afternoon! West Exhibition Hall B2-B3 W-1009

Aditi Raghunathan (@adtraghunathan) 's Twitter Profile Photo

Huge congratulations to Vaishnavh, Chen and Charles on the outstanding paper award 🎉 We will be presenting our #ICML2025 work on creativity in the Oral 3A Reasoning session (West Exhibition Hall C) 10 - 11 am PT. Or please stop by our poster right after @ East Exhibition

Prima Mente (@primamente) 's Twitter Profile Photo

1/ Today we announce Pleiades, a series of epigenetic foundation models (90M→7B params) trained on 1.9T tokens of human methylation & genomic data. Pleiades accurately models epigenetics for genomic track prediction, generation & neurodegenerative disease detection from cfDNA,
