niki parmar (@nikiparmar09)'s Twitter Profile
niki parmar

@nikiparmar09

Working @Anthropic. Views expressed here are my own.

ID: 1698006024

Joined: 25-08-2013 03:28:00

193 Tweets

14.14K Followers

875 Following

Andrej Karpathy (@karpathy):

TLDR: You can get far with a vanilla Transformer (2017). Scrape a massive (though weakly-labeled) dataset and use simple supervised learning. Multi-task. Eval in the zero-shot regime. More perf expected from further model+data scaling. Eval is hard. Some parts (decoding) feel hacky.
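
As a rough illustration of that recipe (not Karpathy's code), here is a minimal PyTorch sketch: a vanilla encoder-decoder Transformer trained with plain supervised cross-entropy on weakly-labeled (source, target) token pairs. The class name, the toy loader, and all hyperparameters are hypothetical stand-ins, and sinusoidal positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

# Tiny sizes for the sketch; the 2017 base model uses d_model=512, 6 layers.
VOCAB, D_MODEL, PAD = 1000, 128, 0

class VanillaSeq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=2, num_decoder_layers=2, batch_first=True)
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src, tgt):
        # Causal mask so the decoder only attends to earlier target tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.embed(src), self.embed(tgt), tgt_mask=mask)
        return self.lm_head(h)

model = VanillaSeq2Seq()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)

# Toy stand-in for a massive weakly-labeled corpus: batches of
# (source tokens, target tokens). Multi-tasking would be expressed by
# prepending task tokens to the target sequence.
loader = [(torch.randint(1, VOCAB, (8, 64)),
           torch.randint(1, VOCAB, (8, 33)))]

for src, tgt in loader:
    logits = model(src, tgt[:, :-1])                      # teacher forcing
    loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```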

Neeva (@neeva):

10/ We are releasing our model (huggingface.co/neeva/query2qu…) and golden set used for eval (huggingface.co/datasets/neeva…) on Hugging Face. Take a look at our latest blog post for more information 👁️ ⤵️ neeva.com/blog/state-of-…

niki parmar (@nikiparmar09):

This particular example, which generates a 2-min-long video based on a changing story, is really cool. Congrats to all the authors!

Nathan Benaich (@nathanbenaich):

🪩The State of AI 2022 is live!🪩 In its 5th year, the #stateofai report condenses what you *need* to know in AI research, industry, safety, and politics. This open-access report is our contribution to the AI ecosystem. Here's my director's cut 🧵: stateof.ai

niki parmar (@nikiparmar09):

Today is as good a day as any to share that I joined Anthropic last Dec :) Claude 3.7 is a remarkable model at complex tasks, especially coding, and I'm thrilled to have contributed to its development. From winning Pokémon badges to vibe coding, Claude's got you covered!

Alexander Ku (@alex_y_ku):

(1/11) Evolutionary biology offers a powerful lens into Transformers' learning dynamics! Two learning modes in Transformers (in-weights & in-context) mirror adaptive strategies in evolution. Crucially, environmental predictability shapes both systems similarly.
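
For intuition only, here is a toy PyTorch contrast (not from the thread) between the two modes: in-weights learning stores new information by updating parameters, while in-context learning keeps the weights frozen and adapts purely through the examples placed in the input. The `nn.Linear` stand-in and all data are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(16, 16)            # stand-in for a Transformer LM
demos_x, demos_y = torch.randn(4, 16), torch.randn(4, 16)

# "In-weights" learning: information is written into the parameters by
# gradient descent (slow and persistent, like genetic adaptation).
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss = F.mse_loss(model(demos_x), demos_y)
opt.zero_grad(); loss.backward(); opt.step()

# "In-context" learning: the weights stay frozen; adaptation comes
# entirely from the demonstrations placed in the input (fast and
# transient, like within-lifetime plasticity).
with torch.no_grad():
    query = torch.randn(1, 16)
    prompt = torch.cat([demos_x, query])  # demonstrations + query
    prediction = model(prompt)[-1]        # no parameter update occurred
```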

niki parmar (@nikiparmar09):

Claude Opus 4 and Sonnet 4 are the best coding models, setting new records across the board. 🚀 We are pushing the limits (80.2% on SWE-Bench!!), advancing the frontier while keeping up the momentum. The benchmarks may soon become saturated, but the capabilities will not!

Aurko Roy (@happylemon56775):

Excited to share what I worked on during my time at Meta.

- We introduce a Triton-accelerated Transformer with *2-simplicial attention*, a tri-linear generalization of dot-product attention (rough sketch below)

- We show how to adapt RoPE to tri-linear forms

- We show 2-simplicial attention scales
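
A hedged sketch of what a tri-linear generalization of dot-product attention can look like, in plain PyTorch rather than the Triton kernels the tweet mentions. The joint softmax over key pairs and the elementwise combination of the two value streams are assumptions based only on the tweet's description, and RoPE is omitted.

```python
import torch

def two_simplicial_attention(q, k1, k2, v1, v2):
    # q, k1, k2, v1, v2: (batch, seq, dim)
    d = q.size(-1)
    # Tri-linear logits: one score per (query i, key j, key k) triple,
    # generalizing the bilinear q·k score of standard attention.
    logits = torch.einsum("bid,bjd,bkd->bijk", q, k1, k2) / d ** 0.5
    b, i, j, k = logits.shape
    # Softmax jointly over all (j, k) key pairs.
    probs = logits.reshape(b, i, j * k).softmax(-1).reshape(b, i, j, k)
    # Weighted sum of elementwise-combined value pairs.
    return torch.einsum("bijk,bjd,bkd->bid", probs, v1, v2)

q = k1 = k2 = v1 = v2 = torch.randn(2, 8, 16)
out = two_simplicial_attention(q, k1, k2, v1, v2)  # (2, 8, 16)
```

Note that the logits tensor is cubic in sequence length, which presumably is what motivates the fused Triton kernels in the actual work.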