Mikhail Parakhin (@mparakhin)'s Twitter Profile
Mikhail Parakhin

@mparakhin

ID: 1506868649495199744

Joined: 24-03-2022 05:41:58

1.1K Tweets

20.2K Followers

21 Following

Mikhail Parakhin (@mparakhin):

Anthropic models in C++ consistently forget an opening angle bracket '<' in complicated multi-line templates:

T_MyTemplate <  // missing here
    P_Param1,
    P_Param2
>

I suspect this is due to interference with their internal tag system. I really hope everyone adopts Harmony...
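As a minimal self-contained sketch of the pattern the tweet describes (T_MyTemplate and the parameter names are the tweet's placeholder names, filled in with hypothetical types), the well-formed instantiation looks like:

```cpp
#include <map>
#include <vector>

// Hypothetical template matching the shape in the tweet.
template <typename P1, typename P2>
struct T_MyTemplate {
    P1 first;
    P2 second;
};

// A correct multi-line instantiation. The '<' right after the template
// name is the token the models reportedly drop.
T_MyTemplate<            // <-- this '<' is the one that goes missing
    std::vector<int>,
    std::map<int, double>
> g_example{};
```

Dropping that '<' turns the instantiation into a plain identifier followed by a stray comma-separated list, so the compile error shows up lines away from the actual mistake.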

Mikhail Parakhin (@mparakhin):

Ethan is a friend, but I think the opposite: OpenAI was sitting on Strawberry for way too long because of inference GPU availability concerns, giving others time to catch up.

Andrej Karpathy (@karpathy):

Bit silly but I still watch the Apple event livestream for new iPhones, every year since the first one in 2007. It doesn't make sense but it's ok. Livestream today at 10am (in 1.5 hours). This year, crossing my fingers again for an iPhone mini that I know won't come. rip.

Mikhail Parakhin (@mparakhin):

LLMs are mostly trained on texts — as in literature. Despite what we say, we don't value conciseness in essays (otherwise TLDRs would never be needed). The models transfer the same 'flowery eloquence' requirement to code generation, where it's the opposite of what we want.

Mikhail Parakhin (@mparakhin):

Yesterday one of our engineers was complaining about React: “It’s bad for LLMs, I have to keep saying ‘make it simpler’ before the results are acceptable”. I just smiled…

Mikhail Parakhin (@mparakhin):

My video generation test is to recreate the famous “footage” segments from Gibson’s Pattern Recognition. Sora 2 does a better job than Veo 3, but still not there. Fun fact: Sam is a big Neuromancer fan, I tried to convince him to call the model Wintermute once. He just laughed :-)

Mikhail Parakhin (@mparakhin):

We got a lot of mileage out of this paper. In retrospect, it is kind of an obvious wrapper around the Straight-Through Estimator, but very effective. Back in the Sydney days, Yuan Yu and I had to build manual hierarchical discretization strategies; here they appear automagically.
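For context, the Straight-Through Estimator is the trick of applying a non-differentiable quantizer in the forward pass while treating it as the identity in the backward pass, so gradients are not zeroed out by the step function. A minimal sketch (the function names are illustrative, not from the paper):

```cpp
#include <cmath>

// Straight-Through Estimator, stripped to its core idea.
// Forward: snap the activation to the nearest discrete level.
double ste_forward(double x) { return std::round(x); }

// Backward: pretend the rounding was the identity, so the upstream
// gradient passes through unchanged instead of being zero almost
// everywhere (the true derivative of round()).
double ste_backward(double upstream_grad) { return upstream_grad; }
```

In a real framework this is one custom-gradient op; hierarchical discretization then amounts to stacking such ops at different granularities rather than hand-designing the levels.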

Mikhail Parakhin (@mparakhin):

Laugh all you want, but NTFS file streams (ADS) are tailor-made for model distribution. On Linux, all the supplementary information has to live in a separate file, and the two inevitably desynchronize. On Windows I always keep model.pt and model.pt:extra_info together.
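A sketch of the idea, assuming an NTFS volume (the helper name is mine; on other filesystems the colon simply becomes part of a literal file name):

```cpp
#include <fstream>
#include <string>

// Write supplementary metadata "next to" a model file.
// On Windows/NTFS, model_path + ":extra_info" names an alternate data
// stream attached to the file itself, so the metadata moves, copies,
// and deletes together with model.pt.
bool write_extra_info(const std::string& model_path, const std::string& info) {
    std::ofstream out(model_path + ":extra_info", std::ios::binary);
    if (!out) return false;
    out << info;
    return out.good();
}
```

On non-NTFS filesystems the same call just creates an ordinary file literally named "model.pt:extra_info" — a separate file that can drift, which is exactly the desynchronization the tweet complains about.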

Mikhail Parakhin (@mparakhin):

It appears Google has slightly limited the thinking budget for DeepThink, I have to think for myself more often again :-(. On the positive side, Demis told me last week that I am "going to be very impressed by Gemini 3" - can't wait!

Mikhail Parakhin (@mparakhin):

Prediction: we will see more passion projects like this. LLMs make them much easier to implement, so I expect someone to get OS/2 and IRIX running on modern hardware.

Mikhail Parakhin (@mparakhin):

EmEditor is in harvesting mode and has become unusable. Which editor for huge text files should I switch to on Windows: UltraEdit? 010? Loxx?