Elman Mansimov's (@elmanmansimov) Twitter Profile
Elman Mansimov

@elmanmansimov

AI scientist whose work has been acquired by a prestigious American museum

ID: 4710897316

Link: http://mansimov.io · Joined: 05-01-2016 03:07:12

1.1K Tweets

3.3K Followers

719 Following

Elman Mansimov (@elmanmansimov):

whoa! this works so well based on my initial tests extracting text from academic papers and invoices. super fast and cheap as well

Elman Mansimov (@elmanmansimov):

You's deep research report looks like the most compelling among the alternatives. finally something more engaging than a wall of words. looking forward to trying it

Elman Mansimov (@elmanmansimov):

just tried chatgpt 4.5. it is indeed better at writing and feels more creative than gpt-4o. yeah, it might not be much better on benchmarks, but it feels too early to arrive at a conclusion

Elman Mansimov (@elmanmansimov):

but ultimately my conclusion after all the LLM releases during the last few weeks: it is becoming exponentially harder to evaluate the latest LLMs' capabilities. human attention span is flat or decreasing, while changes in model capabilities are getting more nuanced and less obvious

Elman Mansimov (@elmanmansimov):

after seeing a lot of tweets about game dev with Claude 3.7 Sonnet and Cursor, I realized that diffusion and generative models of video are not the right way to AI-generated video games. a better way is a great coding LLM + 3D asset generation + animation

Elman Mansimov (@elmanmansimov):

Thanks for the deep dive <a href="/monkantony_tez/">Monk Antony</a>, it inspired me to take a stroll down memory lane! 

I started developing alignDRAW in May 2015 and submitted the paper on Nov 9th. Here are some early outputs on github from Sep 2015: github.com/mansimov/cap2i…

It's cool to see Alec was
Elman Mansimov (@elmanmansimov):

i keep hearing that with enough search budget (i.e. several different queries, retries, etc.) sparse retrieval (i.e. keyword / bm25 search) can outperform (or at least match) dense retrieval using embeddings in symbolic domains like code and text. indeed, if you think about it
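The idea above can be sketched with a from-scratch Okapi BM25 scorer over a toy corpus, where several query reformulations stand in for the "search budget". This is a minimal illustration, not any particular retrieval system; the corpus and queries are made up.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document against a tokenized query."""
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

def search(query, corpus):
    """Return document indices ranked by BM25 score for a keyword query."""
    q = query.lower().split()
    return sorted(range(len(corpus)),
                  key=lambda i: bm25_score(q, corpus[i], corpus),
                  reverse=True)

corpus = [doc.lower().split() for doc in [
    "def parse_invoice(path): extract text from pdf invoice",
    "embedding model maps text to dense vectors",
    "keyword search with bm25 over source code",
]]
# several reformulations of the same intent = the "search budget"
queries = ["parse invoice pdf", "extract invoice text", "invoice parser"]
hits = [search(q, corpus)[0] for q in queries]
```

Each reformulation retrieves the same target document (index 0) by exact keyword match, which is the property that makes sparse retrieval competitive in symbolic domains like code.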

Elman Mansimov (@elmanmansimov):

this looks like a big deal, esp related to our understanding of which architectural tricks are important to train neural nets. ever since batch norm and layer norm were released, they have become indispensable in neural net architectures, making it much easier to stabilize training
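For reference, layer norm itself is only a few lines: normalize each activation vector to zero mean and unit variance, then apply a learned scale and shift. A minimal sketch in plain Python (scalar gamma/beta for simplicity; real implementations use per-feature parameters):

```python
import math

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one activation vector to zero mean / unit variance,
    then scale by gamma and shift by beta."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]

acts = [2.0, 4.0, 6.0, 8.0]
normed = layer_norm(acts)
```

Because the statistics are computed per example (not per batch), layer norm works at any batch size, which is part of why it displaced batch norm in sequence models.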

Elman Mansimov (@elmanmansimov):

there is something inherently very fun about training single-GPU-sized models on small datasets, on problems and outputs that motivate you. once trained, the model outputs look very fun and bring a special excitement. plus you get to understand the tech even better.

FellowshipAI (@fellowshipai):

How It Started and How It’s Going 🫴
alignDRAW (2015) chatGPT 4o (2025)

Inspired by <a href="/tokumei/">tokumei</a> and his post on alignDRAW compared to ChatGPT, here are some fun comparisons between the two models.

📌 Prompt:
A toilet seat sits open in the grass field

← alignDRAW (2015) | chatGPT
Sainbayar Sukhbaatar (@tesatory):

Ten years ago in 2015 we published a paper called End-to-End Memory Networks (arxiv.org/abs/1503.08895). Looking back, this paper had many of the ingredients of current LLMs. Our model was the first language model that completely replaced RNN with attention. It had dot-product

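The dot-product attention the tweet refers to can be sketched in a few lines: score each key against the query by dot product, softmax the scores, and return the weighted sum of values. A toy, dependency-free version (not the memory-network code itself; the vectors are made up):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot_product_attention(query, keys, values):
    """Weight each value vector by the softmaxed query-key dot product."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]   # first key matches the query
values = [[10.0, 0.0], [0.0, 10.0]]
out = dot_product_attention(query, keys, values)
```

The output is pulled toward the value whose key matches the query, which is the mechanism later scaled up in Transformer attention.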
Arab Bank Switzerland (@arabbankch):

🖼Art in the office - Take a virtual tour of our HQ in Geneva and discover the artworks on display!
📸An Airplane Flying Off Into The Distance On A Clear Day by <a href="/elmanmansimov/">Elman Mansimov</a>
Elman Mansimov (@elmanmansimov):

I am attending ICLR in Singapore next week. Would love to meet new people and old friends. DM or email me to set up a meeting.

Elman Mansimov (@elmanmansimov):

Cursor Agent and Claude have a tendency to over-generate code for my tasks. reminding them to be succinct and to the point, like their life depends on it, is a must

Elman Mansimov (@elmanmansimov):

was locked out of my delta account for silly reasons. had to call their representatives to book a flight; it took almost an hour and a half. we take the digital world and the internet for granted sometimes

Elman Mansimov (@elmanmansimov):

the better observation here is that multiple-choice benchmarking via ranking is not the right number to publish, esp if your model is only available via an API. we should release official numbers reflecting how we actually use the model (via generation) rather than ranking with likelihood
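The contrast between the two evaluation protocols can be sketched with toy stand-ins for a real LM: a hypothetical log-likelihood scorer (here just lexical overlap with a tiny "knowledge" string) and a hypothetical generator. Both functions, the question, and the options are made up for illustration; real ranking evals use per-option log-probabilities, which API-only models often don't expose.

```python
KNOWLEDGE = set("the capital of france is paris".split())

def toy_loglik(text):
    """Hypothetical likelihood stand-in: lexical overlap with KNOWLEDGE."""
    return float(len(set(text.split()) & KNOWLEDGE))

def toy_generate(prompt):
    """Hypothetical generator stand-in: a real API would return free-form text."""
    return "the answer is paris"

question = "what is the capital of france"
options = ["paris", "london", "berlin"]

# Ranking protocol: pick the option with the highest model likelihood.
ranked_choice = max(options, key=lambda o: toy_loglik(question + " " + o))

# Generation protocol: ask for an answer, then match it to an option.
generated = toy_generate(question)
gen_choice = next((o for o in options if o in generated), None)
```

Here the two protocols agree, but nothing forces them to: a model can rank the right option highest while generating something else, which is why generation-based numbers better reflect actual API usage.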