Taiji Suzuki (@btreetaiji) Twitter Tweets • TwiCopy

Quanquan Gu

@quanquangu

6 months ago

You don’t need a PhD to be a great AI researcher, as long as you’re standing on the shoulders of 100 who have one.

Taiji Suzuki

@btreetaiji

6 months ago

The IQUNIX low-profile keyboard is pretty good.

Jeff Dean

Very excited to see this agreement. I was able to tour Commonwealth Fusion Systems's facility in December last year and was excited by the potential! Working fusion energy could be game changing for the world.

Taiji Suzuki

@btreetaiji

5 months ago

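This year's NeurIPS review rules are really working (if a reviewer does not submit their reviews, their coauthors get an email and they lose access to the reviews of their own papers). Every paper has a complete set of reviews. This has never happened before.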

ビーム | Seiya Tokui

@beam2d

5 months ago

Our paper on metadata conditioning in LM pretraining arxiv.org/abs/2504.17562 is accepted to CoLM 2025! Huge thanks to all coauthors and reviewers!

Taiji Suzuki

@btreetaiji

5 months ago

Our study of how inserting metadata at the beginning of each text during training changes learning efficiency has been accepted to COLM 2025. This was joint work with researchers at PFN. arxiv.org/pdf/2504.17562

Taiji Suzuki

@btreetaiji

5 months ago

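This is work by Nishikawa, a second-year PhD student (D2). It proposes a method that uses the statistical degrees of freedom to determine the dimension needed when distilling standard attention into linear attention. The method can quantify the "complexity" of each layer and gives a more efficient approximation than fixing the dimension in advance.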

Daisuke Okanohara / 岡野原大輔

@hillbig

5 months ago

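Joint work between the Suzuki Lab at the University of Tokyo and PFN, investigating whether inserting metadata that describes the data during LLM training changes learning efficiency, has been accepted to COLM 2025. It showed that while the gain in learning efficiency is a major benefit, the model loses the opportunity to learn to infer hidden information, so there is a trade-off depending on the conditions of the downstream task.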

Taiji Suzuki

@btreetaiji

5 months ago

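Our work on the learning dynamics of Mixture of Experts has been accepted to ICML 2025. The research was led by Matsutani, an intern (a third-year undergraduate, B3, at the time!), and Kawata, a first-year master's student (M1) in our lab. We show that even problems that are hard for a single network to learn become learnable when the gating network is trained jointly. x.gd/hDIW3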

Ryota Tomioka

@ryotat

5 months ago

BioEmu is now published in Science! 🎉 I'm deeply grateful to the incredible, highly collaborative team that made this happen. Can't wait to see how the community uses BioEmu to better understand protein structure ensembles and their implications in biology and medicine.

Taiji Suzuki

@btreetaiji

5 months ago

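We showed that, in the in-context learning setting, Transformers can perform feature learning "at test time" via softmax attention. We further showed that this test-time learning complexity achieves a rate close to the information-theoretic lower bound and is characterized by a quantity called the "generative exponent." We will present this at ICML 2025. x.gd/WXBCy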

hardmaru

@hardmaru

5 months ago

Google’s Gemini 2.5 paper has 3295 authors arxiv.org/abs/2507.06261

Simon Shaolei Du

@simonshaoleidu

5 months ago

Can transformers analyze code efficiently? ✅ Yes. We prove transformers efficiently handle real compiler tasks (AST construction, symbol resolution, type inference) using only log size, while RNNs require linear size (in input length). Paper: arxiv.org/abs/2410.14706 #COLM2025

Taiji Suzuki

@btreetaiji

5 months ago

Unfortunately, I cannot attend ICML this year, but my students and collaborators will present our work at the main conference. Please stop by our posters!

asap

@asap2650

5 months ago

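arxiv.org/abs/2507.10532 If this is true, then unfortunately the credibility of reinforcement learning papers based on Qwen is completely gone. Even I, who only dip into AI papers now and then, have seen results where reinforcement learning improves Qwen's math ability more than Llama's, so I imagine many researchers were using Qwen. It's a shame.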

OpenAI

@openai

5 months ago

ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths.

Taiji Suzuki

@btreetaiji

5 months ago

I tried making this year's exam for the probability course (確率数理) easier.

Taiji Suzuki

@btreetaiji

5 months ago

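A somewhat late announcement: my earlier article "The Mathematics of Generative AI" (生成AIの数理), together with a follow-up postscript, appears in 「数学とAIのこれまで（とこれから）」 ("Mathematics and AI So Far (and From Here On)"), now available from Nippon Hyoron Sha (日本評論社). The articles by the other distinguished contributors are also very instructive.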

Chen-Yu Lee

@chl260

5 months ago

Thrilled to introduce "Deep Researcher with Test-Time Diffusion," a new deep research agent designed to mimic the iterative nature of human research, complete with cycles of planning, drafting, and revision. 🚀🚀 arxiv.org/pdf/2507.16075

Yiping Lu

@2prime_pku

5 months ago

Anyone knows adam?
