Shikhar (@shikharmurty)'s Twitter Profile
Shikhar

@shikharmurty

Final year PhD student @StanfordNLP

Work on:
- Inductive Biases: Compositionality, syntax trees, generalization
- Digital Agents: Exploration, RL

ID: 956335588453498885

Website: http://www.shikharmurty.com

Joined: 25-01-2018 01:19:06

576 Tweets

2.2K Followers

182 Following

Shikhar (@shikharmurty)

📝Stanford HAI wrote a pretty cool article summarizing some of our work on unsupervised web-agents that learn through open-ended exploration.👇

Brian Roemmele (@brianroemmele)

“NNetNav: Unsupervised Learning of Browser Agents Through Environment Interaction in the Wild” A powerful paper. arxiv.org/abs/2410.02907

Shikhar (@shikharmurty)

👨‍💻Tokenization errors in LLMs have the same vibe as off-by-one errors in software engineering. We develop and make progress on LLMs that can consume *bytes* directly (no tokenization needed!)

Xing Han Lu (@xhluca)

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can evaluate web agent trajectories. We find that rule-based evals underreport success rates, and

Shikhar (@shikharmurty)

Note #1 about TreeReg: At ICLR 2023, we showed that context-freeness of span vectors predicts compositional generalization in transformers (arxiv.org/abs/2211.01288). Pratyusha Sharma and I had over 10 poster attendees asking us about a regularizer based on this idea. It took some

Shikhar (@shikharmurty)

Note #2 about TreeReg (faster grokking): At ACL 2023, we introduced structural grokking — where extended training lets Transformers discover hierarchical structure and generalize OOD, even when shortcuts work in-domain: arxiv.org/abs/2305.18741 With TreeReg, this transition is

TEDAI San Francisco (@tedaisf)

🐋 Can AI help us understand whales — and ourselves? 📷 New TED Talk recorded at TEDAI San Francisco is live! Massachusetts Institute of Technology (MIT) researcher Pratyusha Sharma explores how machine learning is decoding the language of sperm whales — opening new frontiers in AI, linguistics & nature. ted.com/talks/pratyush…

Kabir (@kabirahuja004)

I will be presenting 👇work at #NAACL2025 tomorrow (May 2) from 12 pm in Ballroom A. Please stop by if curious about inductive biases in transformers, generalization, and applying Bayesian models of cognition for understanding language models.

Rajdeep Sardesai (@sardesairajdeep)

I hope the IMF, which gave a $1 billion loan to Pakistan last night, realises that it has BLOOD ON ITS HANDS. India abstained, but shouldn’t many others who speak of ‘zero tolerance to terror’ have joined us?

Shikhar (@shikharmurty)

Some life updates: 1. Defended my thesis, "Building the learning-from-interaction pipeline for LLMs," on LLM browser agents that learn autonomously in digital environments, and inductive biases for compositionality. 2. Moved to NYC to start at Google DeepMind Language, where I