Weijia Shi (@weijiashi2) 's Twitter Profile
Weijia Shi

@weijiashi2

PhD student @uwnlp @allen_ai | Prev @MetaAI @CS_UCLA | 🏠 weijiashi.notion.site

ID: 1161707505497456640

Joined: 14-08-2019 18:33:52

862 Tweets

7.7K Followers

1.1K Following

Oreva Ahia (@orevaahia) 's Twitter Profile Photo

🎉 We’re excited to introduce BLAB: Brutally Long Audio Bench, the first benchmark for evaluating long-form reasoning in audio LMs across 8 challenging tasks, using 833+ hours of Creative Commons audio. (avg length: 51 minutes).

Yuntian Deng (@yuntiandeng) 's Twitter Profile Photo

Can we build an operating system entirely powered by neural networks? Introducing NeuralOS: towards a generative OS that directly predicts screen images from user inputs. Try it live: neural-os.com Paper: huggingface.co/papers/2507.08… Inspired by Andrej Karpathy's vision. 1/5

Han Guo (@hanguo97) 's Twitter Profile Photo

Since our initial arXiv post, several concurrent papers have introduced new architectures with log-linear properties in various forms. Two personal favorites of mine (among others) are:
- Transformer-PSM by Morris Yau et al., and
- Radial Attention by Xingyang and Muyang Li et

Paul Liang (@pliang279) 's Twitter Profile Photo

Building AI reasoning models with extremely long context lengths - think days, weeks, even years of context - is the next big challenge in AI. That's why I'm extremely excited about the latest work from Ao Qu, incoming PhD student in our group, on MEM1: RL for Memory

Yihe Deng (@yihe__deng) 's Twitter Profile Photo

🙌 We've released the full version of our paper, OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles

Our OpenVLThinker-v1.2 is trained through three lightweight SFT → RL cycles, where SFT first “highlights” reasoning behaviors and RL then explores and
Oana Ignat 👩‍💻🎓📚🇷🇴🌍 (@oanaignatro) 's Twitter Profile Photo

Counting down the days until ACL, hosting another ACL Mentorship session, diving into timely topics: when it is so tempting to rely on AI for writing, what role do we play, and what might we be losing? #ACL2025NLP #NLProc

Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Weijia Shi (@weijiashi2) 's Twitter Profile Photo

How to write good reviews & rebuttals? We've invited 🌟 reviewers to share their expertise in person at our ACL mentorship session #ACL2025NLP next week

Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

I’m gonna be recruiting students through both the Language Technologies Institute (NLP) and Engineering & Public Policy at @CarnegieMellon for fall 2026! If you are interested in reasoning, memorization, AI for science & discovery, and of course privacy, you can catch me at ACL! Prospective students fill this form:

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu) 's Twitter Profile Photo

Phase 1 of Physics of Language Models code release
✅our Part 3.1 + 4.1 = all you need to pretrain strong 8B base model in 42k GPU-hours
✅Canon layers = strong, scalable gains
✅Real open-source (data/train/weights)
✅Apache 2.0 license (commercial ok!)
🔗github.com/facebookresear…
Abhilasha Ravichander (@lasha_nlp) 's Twitter Profile Photo

✈️ I'm in Vienna for #ACL2025NLP! Would love to meet and chat about training data, factuality, transparency, doing a PhD in AI🤖, or anything else. Please say hi if you see me!☕️🍰 I am hiring PhD students + interns (shorturl.at/fZnOq), let's chat if you are looking!

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

Scaling CLIP on English-only data is outdated now…

🌍We built CLIP data curation pipeline for 300+ languages
🇬🇧We train MetaCLIP 2 without compromising English-task performance (it actually improves!)
🥳It’s time to drop the language filter!

📝arxiv.org/abs/2507.22062

[1/5] 🧵
Akari Asai (@akariasai) 's Twitter Profile Photo

We’re hosting a NeurIPS competition on real-world Retrieval-Augmented Generation! In addition to automatic and LLM-as-a-judge eval, we’ll feature live user feedback via our interactive RAG Arena. Stay tuned for more details and don’t forget to sign up agi-lti.github.io/MMU-RAGent/

Michi Yasunaga (@michiyasunaga) 's Twitter Profile Photo

gpt-oss (open models) are out - they can reason, can code, and can use tools like browsing and python to solve agentic tasks. Hope they are useful for the community!

Omar Khattab (@lateinteraction) 's Twitter Profile Photo

This was a really fun collab during my time at Databricks !! It’s basically a product answer to the fact that: (1) People want to optimize their agents and to specialize them for downstream preferences (no free lunch!) (2) People don’t have upfront training sets—or even

Tao Yu (@taoyds) 's Twitter Profile Photo

As computer-use agents (CUAs) handle critical digital tasks, open research is key to study their capabilities, risks.

🚀After a year, we release OpenCUA: 1) largest CUA dataset/tool, 2) training recipe, 3) ~SOTA model on OSWorld.

Released to drive transparent, safe CUA research!
Tim Althoff (@timalthoff) 's Twitter Profile Photo

I’m excited to share our new Nature paper 📝, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health.

Details in thread.🧵

nature.com/articles/s4158…
Pratyush Maini (@pratyushmaini) 's Twitter Profile Photo

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens🧑🏼‍🍳
- 3B LLMs beat 8B models🚀
- Pareto frontier for performance