Tsendsuren (@tsendeemts)'s Twitter Profile
Tsendsuren

@tsendeemts

Research scientist at Google DeepMind | previously at Microsoft Research and a postdoc at UMass. Views are my own. Most tweets in Mongolian 🇲🇳.

ID: 105984261

Website: http://www.tsendeemts.com/ | Joined: 18-01-2010 03:58:52

21.2K Tweets

4.4K Followers

587 Following

Guillaume Lample @ NeurIPS 2024 (@guillaumelample)'s Twitter Profile Photo

Very excited to release our first reasoning model, Magistral. We released the weights of Magistral Small alongside a paper that presents our approach, online RL infrastructure, and findings.

Christian Szegedy (@chrszegedy)'s Twitter Profile Photo

The history of AI from 2012 to present shows each paradigm solving previous limitations while revealing new ones. The next state-of-the-art will be Supervised RL for reasoning. It is fundamentally bottlenecked by the need for verifiable environments. 2/7

Tsendsuren (@tsendeemts)'s Twitter Profile Photo

Looks interesting! Here is what I did for fast adaptation 5 years ago: arxiv.org/pdf/2009.01803. Curious to see the advances since then.

Christian Szegedy (@chrszegedy)'s Twitter Profile Photo

The Inception paper arxiv.org/abs/1409.4842 was awarded the Longuet-Higgins Prize (test of time). The architecture represented a significant step forward in inference efficiency, especially on CPU, and variants of Inception networks were used in Google products for years.

MiniMax (official) (@minimax__ai)'s Twitter Profile Photo

Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning.

- World’s longest context window: 1M-token input, 80k-token output
- State-of-the-art agentic use among open-source models
- RL at unmatched efficiency:
Mongol Tsakhia ELBEGDORJ (@elbegdorj)'s Twitter Profile Photo

Eighty U.S. Senators are backing tariffs on Russian oil buyers. As MONGOLIA, democracy’s lone outpost, we face a hard truth: landlocked and bordered by two giants. We have no alternative but to import fuel from the north. I hope our unique position earns an understanding and

𝚐𝔪𝟾𝚡𝚡𝟾 (@gm8xx8)'s Twitter Profile Photo

AoE

DeepSeek-R1T-Chimera, a 671B hybrid model built by merging only routed experts from DeepSeek-R1 into V3-0324

- No fine-tuning, no distillation
- Matches R1 reasoning while using ~40% fewer output tokens (~2.5× more concise)
- Fully functional out-of-the-box

Method -
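
To make the recipe above concrete, here is a minimal sketch of merging only routed-expert weights from one MoE checkpoint into another. This is illustrative only: the `.mlp.experts.` key pattern and the plain state-dict inputs are assumptions, not the actual DeepSeek checkpoint layout or the exact AoE method behind R1T-Chimera.

```python
import torch

def merge_routed_experts(base_sd, donor_sd, expert_tag=".mlp.experts."):
    """Copy routed-expert tensors from a donor checkpoint (e.g. R1) into a
    base checkpoint (e.g. V3-0324). Everything else (attention, routers,
    shared experts, embeddings) keeps the base model's weights."""
    merged = {}
    for name, tensor in base_sd.items():
        if expert_tag in name and name in donor_sd:
            merged[name] = donor_sd[name].clone()  # take the donor's experts
        else:
            merged[name] = tensor.clone()          # keep the base weights
    return merged
```

Since a merge like this only swaps existing tensors and trains nothing, the result can be loaded and served directly, which matches the "no fine-tuning, no distillation" and "fully functional out-of-the-box" claims.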
Tsendsuren (@tsendeemts)'s Twitter Profile Photo

Google has released a technical report on the development of the Gemini 2.5 AI model: storage.googleapis.com/deepmind-media…

Tu Vu (@tuvllms)'s Twitter Profile Photo

Excited to share that our paper on model merging at scale has been accepted to Transactions on Machine Learning Research (TMLR). Huge congrats to my intern Prateek Yadav and our awesome co-authors Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Mohit Bansal, and Tsendsuren 🎉!!

Tsendsuren (@tsendeemts)'s Twitter Profile Photo

This work got accepted at Transactions on Machine Learning Research (TMLR). Congratulations to Prateek Yadav and my co-authors. Also, thank you to the reviewers and editors for their time.
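
For readers new to the topic, the simplest form of model merging is uniform weight averaging over fine-tuned checkpoints that share an architecture. Below is a minimal sketch of that generic baseline; it is illustrative only, not the specific recipe studied in the TMLR paper, and the state-dict inputs are assumed.

```python
import torch

def average_checkpoints(state_dicts):
    """Uniformly average parameters across fine-tuned checkpoints that
    share an architecture and state-dict key layout (the simplest
    merging baseline, often called a model soup)."""
    merged = {}
    for name in state_dicts[0]:
        stacked = torch.stack([sd[name].float() for sd in state_dicts])
        merged[name] = stacked.mean(dim=0)  # element-wise mean over models
    return merged
```

The question studied at scale is how well such schemes hold up as model size and the number of merged models grow; the averaging step itself stays this simple.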

Oreva Ahia (@orevaahia)'s Twitter Profile Photo

🎉 We’re excited to introduce BLAB: Brutally Long Audio Bench, the first benchmark for evaluating long-form reasoning in audio LMs across 8 challenging tasks, using 833+ hours of Creative Commons audio (avg length: 51 minutes).

Arion Das || Gen AI Research || LLMs || NLP (@ariondas)'s Twitter Profile Photo

Tsendsuren Prateek Yadav I tried implementing one of your papers, "Infini Transformer", from scratch: github.com/ArionDas/Infin… But you never shared the code with me when I requested it 😄🙌
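
For readers wondering what that repo reimplements: the core of the Infini Transformer paper ("Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention") is a compressive memory that each attention head reads and then updates once per segment. A minimal single-head sketch of that step follows; it uses the paper's linear-attention update but omits the delta-rule variant and the learned gate that mixes memory reads with local attention, so treat it as an assumption-laden illustration.

```python
import torch
import torch.nn.functional as F

def infini_memory_step(M, z, Q, K, V):
    """One segment of Infini-attention's compressive memory.
    Shapes (single head, for clarity): Q, K, V are [seg_len, d_head];
    M (associative memory) is [d_head, d_head]; z (normalizer) is [d_head]."""
    sigma_q = F.elu(Q) + 1.0  # the paper's nonlinearity, ELU + 1
    sigma_k = F.elu(K) + 1.0
    # Retrieval: normalized linear-attention read of the previous memory.
    A_mem = (sigma_q @ M) / (sigma_q @ z).clamp(min=1e-6).unsqueeze(-1)
    # Update: accumulate key-value associations and the normalizer.
    M_new = M + sigma_k.transpose(0, 1) @ V
    z_new = z + sigma_k.sum(dim=0)
    return A_mem, M_new, z_new
```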