Rio Yokota (@rioyokota) Twitter Tweets • TwiCopy

I am honored to announce that my Ph.D. thesis has been selected as the winner of the Seiichi Tejima Research Award from Science Tokyo (formerly Tokyo Tech). Thank you all for supporting my great Ph.D. life!

Oleksii Kuchaiev

@kuchaev

4 months ago

We are excited to release new Llama-Nemotron models. These models allow you to set reasoning ON/OFF during runtime. We also release all the post-training data under CC-BY-4! Try it now on build.nvidia.com/nvidia/llama-3… HF collection: huggingface.co/collections/nv…

Kazuki Fujii

@okoge_kaz

4 months ago

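I wrote Part 1 of a blog series that explains, with implementations, the techniques you should understand for FP8 training! Expectations for low-precision training are rising after the GTC25 announcements, so please give it a read! zenn.dev/kaz20/articles…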

Taishi Nakamura@ICLR2025🇸🇬

@setuna7777_2

4 months ago

I pretrained LLM-jp's MoE models. The 8x13B delivers the best performance of any model llm-jp has released so far, so please give it a try. The license is Apache 2, so you can use it freely. huggingface.co/llm-jp/llm-jp-… huggingface.co/llm-jp/llm-jp-… huggingface.co/llm-jp/llm-jp-…

Reads with Ravi

@readswithravi

4 months ago

10 Concepts That Explain The Modern World:

Naoaki Okazaki

@chokkanorg

3 months ago

We added Gemma 3 1B, 4B, 12B, 27B, GPT-4 (gpt-4-0613), GPT-4.5 (gpt-4.5-preview-2025-02-27), and o1 (o1-2024-12-17) to the Swallow Leaderboard. GPT-4.5 (0.8840) tops the Japanese MT-Bench, but it is impressive that Gemma 3 27B IT (0.8550) comes right behind it. swallow-llm.github.io/leaderboard/in…

Taishi Nakamura@ICLR2025🇸🇬

@setuna7777_2

3 months ago

I'll be attending ICLR 2025 in Singapore 🇸🇬! Our work will be featured in: 🔬 Two papers at the main conference: - "Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization" 📍 Hall 3 + Hall 2B #277 📅 Sat, Apr 26 | ⏰ 10AM-12:30PM (+08) - "Agent

Kazuki Fujii

@okoge_kaz

3 months ago

Our Swallow-Code was featured! Thanks a lot, Daniel van Strien! Beyond Python, we aim to expand our approach to other major programming languages, like HuggingFace’s Stack-Edu, as future work. Haven’t read the paper yet? Check it out! Paper: arxiv.org/abs/2505.02881

Naoaki Okazaki

@chokkanorg

2 months ago

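We have released Gemma-2-Llama Swallow 2B, 9B, and 27B. At each scale, they deliver top-class performance in Japanese understanding, generation, and dialogue, so we hope you will make use of them. For the compute used to train the models, we received support from Google through the TPU Research Cloud (TRC). swallow-llm.github.io/gemma2-llama-s…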

Lisan al Gaib

@scaling01

2 months ago

what a joke this was - AI doomers in shambles

Kazuki Fujii

@okoge_kaz

2 months ago

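Is there any demand for a technical blog post about the pain of developing and maintaining llm-recipes: github.com/okoge-kaz/llm-… , which I used to maintain...? I suspect more useful articles would be in higher demand, so I wrote it partway and have left it sitting unfinished...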

Naoaki Okazaki

@chokkanorg

a month ago

Gemma-2-Llama Swallow was featured in Google DeepMind's Gemmaverse (a collection of Gemma use cases). deepmind.google/models/gemma/g…

Kazuki Fujii

@okoge_kaz

a month ago

Our use of AWS SageMaker HyperPod to train Llama-3.3-Swallow-70B is now the subject of an official AWS tech blog post! I will talk about this in more detail at AWS Summit Japan. aws.amazon.com/jp/blogs/machi…

Kazuki Fujii

@okoge_kaz

a month ago

Excited to share our latest achievement: training Llama 3.3 Swallow, a 70B-parameter Japanese sovereign LLM, leveraging AWS SageMaker HyperPod! The model outperforms leading models like GPT-4o-mini in Japanese tasks, showcasing significant advancements in language AI. Read

chokudai(高橋直大)@AtCoder

@chokudai

a month ago

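Together with Sakana AI, we developed "ALE-Bench", a benchmark for AI systems that solve AtCoder Heuristic Contest problems! Compared with the Algorithm division, AtCoder's Heuristic division is closer to practical optimization development, and I think it will be very useful once AI can solve these problems!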

Satoshi Matsuoka

@profmatsuoka

a month ago

Please spread the word! We have a big conference center in the heart of Osaka (Grand Cube), with lots of rooms for workshops/tutorials, papers, invited tracks, BoFs, etc. There is a large exhibit hall, and the auditorium holds 2700 people for exciting keynotes every day.
