Weijia Shi (@weijiashi2) 's Twitter Profile
Weijia Shi

@weijiashi2

PhD student @uwnlp @allen_ai | Prev @MetaAI @CS_UCLA | 🏠 weijiashi.notion.site

ID: 1161707505497456640

Joined: 14-08-2019 18:33:52

862 Tweets

7.7K Followers

1.1K Following

Oreva Ahia (@orevaahia) 's Twitter Profile Photo

🎉 We’re excited to introduce BLAB: Brutally Long Audio Bench, the first benchmark for evaluating long-form reasoning in audio LMs across 8 challenging tasks, using 833+ hours of Creative Commons audio. (avg length: 51 minutes).

Yuntian Deng (@yuntiandeng) 's Twitter Profile Photo

Can we build an operating system entirely powered by neural networks? Introducing NeuralOS: towards a generative OS that directly predicts screen images from user inputs. Try it live: neural-os.com Paper: huggingface.co/papers/2507.08… Inspired by Andrej Karpathy's vision. 1/5

Han Guo (@hanguo97) 's Twitter Profile Photo

Since our initial arXiv post, several concurrent papers have introduced new architectures with log-linear properties in various forms. Two personal favorites of mine (among others) are:
- Transformer-PSM by Morris Yau et al., and
- Radial Attention by Xingyang and Muyang Li et

Paul Liang (@pliang279) 's Twitter Profile Photo

Building AI reasoning models with extremely long context lengths - think days, weeks, even years of context - is the next big challenge in AI. That's why I'm extremely excited about the latest work from Ao Qu, incoming PhD student in our group, on MEM1: RL for Memory

Yihe Deng (@yihe__deng) 's Twitter Profile Photo

🙌 We've released the full version of our paper, OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles

Our OpenVLThinker-v1.2 is trained through three lightweight SFT → RL cycles, where SFT first “highlights” reasoning behaviors and RL then explores and
Oana Ignat 👩‍💻🎓📚🇷🇴🌍 (@oanaignatro) 's Twitter Profile Photo

Counting down the days until ACL, hosting another ACL Mentorship session, diving into timely topics: when it is so tempting to rely on AI for writing, what role do we play, and what might we be losing? #ACL2025NLP #NLProc

Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

🧵 Academic job market season is almost here! There's so much rarely discussed—nutrition, mental and physical health, uncertainty, and more. I'm sharing my statements, essential blogs, and personal lessons here, with more to come in the upcoming weeks! ⬇️ (1/N)

Weijia Shi (@weijiashi2) 's Twitter Profile Photo

How to write good reviews & rebuttals? We've invited 🌟 reviewers to share their expertise in person at our ACL mentorship session #ACL2025NLP next week

Niloofar (on faculty job market!) (@niloofar_mire) 's Twitter Profile Photo

I’m gonna be recruiting students through both the Language Technologies Institute (NLP) and Engineering & Public Policy at @CarnegieMellon for fall 2026! If you are interested in reasoning, memorization, AI for science & discovery, and of course privacy, you can catch me at ACL! Prospective students fill this form:

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu) 's Twitter Profile Photo

Phase 1 of Physics of Language Models code release
✅our Part 3.1 + 4.1 = all you need to pretrain strong 8B base model in 42k GPU-hours
✅Canon layers = strong, scalable gains
✅Real open-source (data/train/weights)
✅Apache 2.0 license (commercial ok!)
🔗github.com/facebookresear…
Abhilasha Ravichander (@lasha_nlp) 's Twitter Profile Photo

✈️ I'm in Vienna for #ACL2025NLP! Would love to meet and chat about training data, factuality, transparency, doing a PhD in AI🤖, or anything else. Please say hi if you see me!☕️🍰 I am hiring PhD students + interns (shorturl.at/fZnOq), let's chat if you are looking!

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

Scaling CLIP on English-only data is outdated now…

🌍We built CLIP data curation pipeline for 300+ languages
🇬🇧We train MetaCLIP 2 without compromising English-task performance (it actually improves!)
🥳It’s time to drop the language filter!

📝arxiv.org/abs/2507.22062

[1/5] 🧵
Akari Asai (@akariasai) 's Twitter Profile Photo

We’re hosting a NeurIPS competition on real-world Retrieval-Augmented Generation! In addition to automatic and LLM-as-a-judge eval, we’ll feature live user feedback via our interactive RAG Arena. Stay tuned for more details and don’t forget to sign up agi-lti.github.io/MMU-RAGent/

Michi Yasunaga (@michiyasunaga) 's Twitter Profile Photo

gpt-oss (open models) are out - they can reason, can code, and can use tools like browsing and python to solve agentic tasks. Hope they are useful for the community!

Omar Khattab (@lateinteraction) 's Twitter Profile Photo

This was a really fun collab during my time at Databricks !! It’s basically a product answer to the fact that: (1) People want to optimize their agents and to specialize them for downstream preferences (no free lunch!) (2) People don’t have upfront training sets—or even

Tao Yu (@taoyds) 's Twitter Profile Photo

As computer-use agents (CUAs) handle critical digital tasks, open research is key to study their capabilities, risks.

🚀After a year, we release OpenCUA: 1) largest CUA dataset/tool, 2) training recipe, 3) ~SOTA model on OSWorld.

Released to drive transparent, safe CUA research!
Tim Althoff (@timalthoff) 's Twitter Profile Photo

I’m excited to share our new Nature paper 📝, which provides strong evidence that the walkability of our built environment matters a great deal to our physical activity and health.

Details in thread.🧵

nature.com/articles/s4158…
Pratyush Maini (@pratyushmaini) 's Twitter Profile Photo

1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today DatologyAI shares BeyondWeb, our synthetic data approach & all the learnings from scaling it to trillions of tokens🧑🏼‍🍳
- 3B LLMs beat 8B models🚀
- Pareto frontier for performance