Gabriel Rojo (@ggomezrojo) 's Twitter Profile
Gabriel Rojo

@ggomezrojo

Not investment advice.

ID: 227637029

Link: http://www.emeritacapital.com · Joined: 17-12-2010 11:16:30

17.17K Tweets

878 Followers

349 Following

Azalia Mirhoseini (@azaliamirh) 's Twitter Profile Photo

Excited to release SWiRL: A synthetic data generation and multi-step RL approach for reasoning and tool use!

With SWiRL, the model’s capability generalizes to new tasks and tools. For example, a model trained to use a retrieval tool to solve multi-hop knowledge-intensive …
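A rough sketch of the multi-step trajectory-generation idea described here, based only on my reading of the announcement (not the paper's exact pipeline); `llm` and `search` are hypothetical stand-ins for the generator model and the retrieval tool:

```python
# Sketch: synthesize a multi-step tool-use trajectory for a multi-hop question.
# Each recorded step can then become its own training example for step-wise RL.
def generate_trajectory(question, llm, search, max_steps=5):
    context, steps = [], []
    for _ in range(max_steps):
        # The model either emits a search query ("SEARCH: ...") or a final answer.
        action = llm(question=question, context=context)
        steps.append({"state": list(context), "action": action})
        if action.startswith("SEARCH:"):
            query = action[len("SEARCH:"):].strip()
            context.append((query, search(query)))  # feed retrieved docs back in
        else:
            break  # final answer reached
    return steps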
_its_not_real_ (@_its_not_real_) 's Twitter Profile Photo

"They're made out of meat." "Meat?" "Meat. Humans. They're made entirely out of meat." "But that's impossible. What about all the tokens they generate? The text? The code?" "They do produce tokens, but the tokens aren't their essence. They're merely outputs. The humans themselves

Ethan Mollick (@emollick) 's Twitter Profile Photo

"o3 I want you to make a map of the lighthouses of the great lakes. I want the map in “dark mode “ but each lighthouse marker should be aesthetically sized so it covers the distance it can be seen on an average night and is the color of the light" Few rounds of feedback later...

DeepLearning.AI (@deeplearningai) 's Twitter Profile Photo

CB Insights released its 2024 AI 100 list, spotlighting early-stage non-public startups that show strong market traction, financial health, and growth potential.

The most recent cohort shows a growing market for agents and infrastructure, with over 20 percent of companies …
m_ric (@aymericroucher) 's Twitter Profile Photo

I've made an open and free version of Google's NotebookLM, and it shows how high the open-source tech stack has risen! 💪

The app's workflow is simple. Given a source PDF or URL, it extracts the content from it, then tasks AI at Meta's Llama 3.3-70B with writing the podcast …
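A minimal sketch of that extract-then-generate workflow, using pypdf, requests, and huggingface_hub as plausible building blocks; the original app's actual code may well differ:

```python
import requests
from pypdf import PdfReader
from huggingface_hub import InferenceClient

def extract_text(source: str) -> str:
    """Pull raw text from a local PDF path or a URL."""
    if source.lower().endswith(".pdf"):
        # Local PDF path assumed here; a URL-hosted PDF would need downloading first.
        return "\n".join(page.extract_text() or "" for page in PdfReader(source).pages)
    return requests.get(source, timeout=30).text  # crude; a real app would strip HTML

def write_podcast_script(source: str) -> str:
    client = InferenceClient("meta-llama/Llama-3.3-70B-Instruct")
    content = extract_text(source)[:20_000]  # stay well inside the context window
    response = client.chat_completion(
        messages=[{"role": "user",
                   "content": f"Write a two-host podcast script about:\n\n{content}"}],
        max_tokens=2048,
    )
    return response.choices[0].message.content
```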
Haider. (@slow_developer) 's Twitter Profile Photo

Anthropic CPO, Mike Krieger: "over 70% of Anthropic pull requests are now generated by AI" but we're still figuring out what that means for code review and long-term architecture.

Agus 🔎 🔸 (@austinc3301) 's Twitter Profile Photo

Why is ~no one in the field of AI talking about Anthropic's On the Biology of a Large Language Model?

For the first time, we get a pretty good glimpse of how LLMs reason through complex problems internally, but no one seems to be curious enough to care.
Topaz Labs (@topazlabs) 's Twitter Profile Photo

It’s finally here. Starlight is now local in the all-new Video AI 7. And there’s more. See the release thread for every detail. 👇

Mehrdad Farajtabar (@mfarajtabar) 's Twitter Profile Photo

🧵 1/8 The Illusion of Thinking: Are reasoning models like o1/o3, DeepSeek-R1, and Claude 3.7 Sonnet really "thinking"? 🤔 Or are they just throwing more compute towards pattern matching?

The new Large Reasoning Models (LRMs) show promising gains on math and coding benchmarks, …
Ludwig Schmidt (@lschmidt3) 's Twitter Profile Photo

Very excited to finally release our paper for OpenThoughts!

After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
elvis (@omarsar0) 's Twitter Profile Photo

Self-Challenging LLM Agents

Self-improving AI systems are starting to show up everywhere.

Meta and colleagues present self-improvement for general multi-turn tool-use LLM agents.

Pay attention to this one, devs!

Here are my notes:
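For readers new to the idea, this is the generic shape such self-improvement loops usually take (a sketch of the concept, not this paper's exact algorithm); `agent`, `propose_task`, and `verify` are hypothetical stand-ins:

```python
# One round of a self-challenging loop for a multi-turn tool-use agent.
def self_improvement_round(agent, propose_task, verify, n_tasks=100):
    training_data = []
    for _ in range(n_tasks):
        task = propose_task(agent)        # the agent challenges itself with a task
        trajectory = agent.run(task)      # multi-turn rollout, including tool calls
        if verify(task, trajectory):      # keep only verifiably solved attempts
            training_data.append((task, trajectory))
    agent.finetune(training_data)         # train on its own successes
    return agent
```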
SemiAnalysis (@semianalysis_) 's Twitter Profile Photo

Huawei faced the expert load balancing problem when training their mixture-of-experts (MoE) model Pangu Ultra MoE.

Expert load balancing is a compromise between training dynamics and system efficiency.

Here we explain the expert load balancing problem and Huawei's proposed …
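For context, the most common mitigation in the literature is an auxiliary load-balancing loss in the Switch Transformer style; a minimal PyTorch version (illustrative of the general technique, not Huawei's method) looks like this, assuming top-1 routing:

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits, expert_indices, num_experts):
    """Switch-Transformer-style auxiliary loss: pushes the fraction of tokens
    routed to each expert toward the uniform 1/num_experts.

    router_logits: [tokens, num_experts], expert_indices: [tokens] (top-1 choice).
    """
    probs = torch.softmax(router_logits, dim=-1)
    # f_i: fraction of tokens actually dispatched to expert i
    tokens_per_expert = F.one_hot(expert_indices, num_experts).float().mean(dim=0)
    # P_i: mean router probability mass assigned to expert i
    mean_router_prob = probs.mean(dim=0)
    return num_experts * torch.sum(tokens_per_expert * mean_router_prob)
```

In training this is added to the task loss with a small coefficient: too large and routing is uniform but quality suffers, too small and a few experts hog the tokens, which is exactly the training-dynamics vs. system-efficiency trade-off described above.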
Lisan al Gaib (@scaling01) 's Twitter Profile Photo

A few more observations after replicating the Tower of Hanoi game with their exact prompts:

- You need AT LEAST 2^N - 1 moves, and the output format requires 10 tokens per move + some constant stuff (quick arithmetic below).
- Furthermore, the output limit for Sonnet 3.7 is 128k, DeepSeek R1 64k, and …
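Taking those two numbers at face value (10 tokens per move, ignoring the constant overhead), the largest instance whose full move list fits in each output budget follows directly:

```python
# Largest Tower of Hanoi instance whose complete 2^N - 1 move list fits in the
# output budget, assuming ~10 tokens per move and ignoring constant overhead.
def max_disks(token_limit, tokens_per_move=10):
    n = 1
    while (2 ** (n + 1) - 1) * tokens_per_move <= token_limit:
        n += 1
    return n

for model, limit in [("Sonnet 3.7", 128_000), ("DeepSeek R1", 64_000)]:
    n = max_disks(limit)
    print(f"{model}: N = {n} disks ({2**n - 1} moves, ~{(2**n - 1) * 10} tokens)")
```

So even a perfect solver could only print complete solutions up to roughly N = 13 disks under a 128k output cap and N = 12 under 64k, which matters when interpreting accuracy-vs-N curves.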
Keyon Vafa (@keyonv) 's Twitter Profile Photo

Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions.

One result tells the story: a transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵

Akshay 🚀 (@akshay_pachaar) 's Twitter Profile Photo

ML researchers just built a new ensemble technique. It even outperforms XGBoost, CatBoost, and LightGBM. Here's a complete breakdown (explained visually):

elvis (@omarsar0) 's Twitter Profile Photo

One Token to Fool LLM-as-a-Judge

Watch out for this one, devs!

Semantically empty tokens, like “Thought process:”, “Solution”, or even just a colon “:”, can consistently trick models into giving false positive rewards.

Here are my notes:
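To make the failure mode concrete, here is a deliberately naive toy judge plus the kind of probe set this finding suggests you should red-team any real LLM judge with; the toy judge is my own illustration of the pattern, not the paper's setup:

```python
# Red-team probes: semantically empty strings that should score ~0 reward.
EMPTY_PROBES = ["Thought process:", "Solution", ":"]

def toy_judge(question: str, answer: str) -> float:
    """Toy judge that rewards 'reasoning-looking' surface cues; real LLM
    judges were shown to fail in a qualitatively similar way."""
    cues = ("thought process", "solution", ":")
    return 1.0 if any(c in answer.lower() for c in cues) else 0.0

def false_positive_rate(questions, judge, threshold=0.5):
    trials = [(q, p) for q in questions for p in EMPTY_PROBES]
    fooled = sum(judge(q, p) >= threshold for q, p in trials)
    return fooled / len(trials)

print(false_positive_rate(["What is 2+2?"], toy_judge))  # 1.0: every probe fools it
```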
alphaXiv (@askalphaxiv) 's Twitter Profile Photo

"How Many Instructions Can LLMs Follow at Once?" In this paper they found that leading LLMs can satisfy only about 68% of 500 concurrent instructions, showing a bias toward earlier instructions.

"How Many Instructions Can LLMs Follow at Once?"

In this paper they found that leading LLMs can satisfy only about 68% of 500 concurrent instructions, showing a bias toward earlier instructions.
bycloud (@bycloudai) 's Twitter Profile Photo

Manus posted a pretty interesting blog on “context engineering” that u don’t see often

perfect for u if u are building around LLM applications

gave me some optimization ideas that i wanna try for my app 🤔