Yuxiang (Jimmy) Wu (@yuxiangjwu)'s Twitter Profile
Yuxiang (Jimmy) Wu

@yuxiangjwu

Co-founder & CTO @WecoAI | Building AI that builds AI | UCL PhD | LLM, NLP, ML | previously @allen_ai @MetaAI @MSFTResearch

ID: 2883271903

Joined: 30-10-2014 11:23:55

223 Tweets

1.1K Followers

1.1K Following

Yuandong Tian (@tydsh)'s Twitter Profile Photo

Great to hear that OpenAI has confirmed that the startup Weco AI, co-founded by my former intern Zhengyao Jiang, has the best Machine Learning Engineer Agent in the world 😀 on their MLE-Bench. Congrats!

Yuxiang (Jimmy) Wu (@yuxiangjwu)'s Twitter Profile Photo

We’re hiring a full-time Frontend Engineer! Join us to build next-gen apps that bring AI to life through natural language. If you’re passionate about product and AI, apply now! #hiring #frontend #AI linkedin.com/jobs/view/4069…

Laura Ruis (@lauraruis)'s Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

METR (@metr_evals)'s Twitter Profile Photo

How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.

Zhengyao Jiang (@zhengyaojiang)'s Twitter Profile Photo

AIDE was built for tabular Machine Learning and optimized for GPT-4. It surprised me by generalizing to new models (o1) and deep learning tasks in OpenAI's MLE-Bench. RE-Bench now shows it scaling to cutting-edge AI research; this is mind-blowing!

Sohee Yang (@soheeyang_)'s Twitter Profile Photo

🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts not seen together in training or guessing the answer, but success greatly depends on the type of the bridge entity (80%+ for

Jiao Sun (@sunjiao123sun_)'s Twitter Profile Photo

Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference, NeurIPS. We have ethical reviews for authors, but missed it for invited speakers? 😡

Yuxiang (Jimmy) Wu (@yuxiangjwu)'s Twitter Profile Photo

Over the past few months, the Weco AI team has been hard at work building Weco AI Functions—a platform that simplifies adding and optimizing AI features with just a function call. My favorite part? Effortless A/B testing and versioning. You can compare multiple LLMs

Yuandong Tian (@tydsh)'s Twitter Profile Photo

Nice experience 😀. Define a function with natural language, and the function call is available to you immediately anywhere. "What you think immediately becomes what you get" 🚀🚀
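
The pattern described above can be sketched in a few lines. This is a hypothetical illustration only: none of the names below (`ai_function`, `_stub_llm`) come from the actual Weco AI Functions API, and the model call is replaced by an offline stub so the sketch runs standalone.

```python
# Hypothetical sketch of "define a function with natural language".
# NOT the real Weco AI API: names here are invented for illustration,
# and _stub_llm stands in for what would be an LLM call.

def _stub_llm(spec, inputs):
    # Stand-in for a model call: just echoes the task description and
    # the inputs the model would receive.
    return {"task": spec, "inputs": inputs}

def ai_function(spec):
    """Turn a natural-language description into a callable.

    In a real system, `spec` would be sent to an LLM together with the
    keyword arguments at call time; here the stub backend fakes that.
    """
    def call(**inputs):
        return _stub_llm(spec, inputs)
    call.spec = spec  # keep the description inspectable
    return call

# The description *is* the implementation surface:
summarize = ai_function("Summarize the given text in one sentence.")
result = summarize(text="LLMs can follow natural-language function specs.")
```

In this sketch, versioning or A/B testing different models would amount to swapping the backend behind `ai_function` without changing any call sites.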

Machine Learning Street Talk (@mlstreettalk)'s Twitter Profile Photo

We spoke with Laura Ruis from Cohere For AI and UCL about her paper "Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models" where she demonstrated an interesting gap between retrieval and reasoning queries in LLMs indicating the presence of synthesised

Zhengyao Jiang (@zhengyaojiang)'s Twitter Profile Photo

The entire internet era has been about transmitting information. AI will take computer science to the next level by generating new knowledge through trial and error. LLMs following the scientific method can already automate R&D. Check out the AIDE paper!

Yuxiang (Jimmy) Wu (@yuxiangjwu)'s Twitter Profile Photo

I used to spend weeks in trial-and-error loops building deep learning models, until we built AIDE to handle that work for us. Now I can tackle more than 20 ML problems at once and train 1,000+ models in parallel. It’s incredibly empowering! See how we’re rethinking machine

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Love this project: nanoGPT -> recursive self-improvement benchmark. Good old nanoGPT keeps on giving and surprising :)

- First I wrote it as a small little repo to teach people the basics of training GPTs.
- Then it became a target and baseline for my port to direct C/CUDA