Jan Feyereisl
@thefillm
Senior Research Scientist - GoodAI (@GoodAIdev) & Executive Director - AI Roadmap Institute (@AIroadmap)
ID: 23861430
https://www.goodai.com/
12-03-2009 00:33:29
417 Tweets
757 Followers
3.3K Following
Check the 🔄Word Swap Challenge🔄: for just a bunch of cents, you can test an LLM and predict how good it will be at understanding your prompts. Apart from the great results, I really enjoyed collaborating with Jan Feyereisl on this work.
Fascinating new LLM benchmark alert! 🚨 My colleagues David Castillo and Jan Feyereisl have developed a simple yet powerful test exposing LLMs' struggles with sequential reasoning. Key findings: • Most LLMs start erring after just 2 operations 😮 • OpenAI o1-mini unsurprisingly