mgostIH (@mgostih)'s Twitter Profile
mgostIH

@mgostih

What if Computer Science and Math had a baby? 😇

ID: 796416741765214208

Link: https://www.youtube.com/channel/UCZV-Iin4pXh27VD_YyM2mQQ
Joined: 09-11-2016 18:18:42

1.1K Tweets

4.4K Followers

382 Following

Roman Gaditskii (@gadirom_)

SGD vs Adam 💡 I punish x10 for distance to mouse in the loss function, but Adam's gradient normalisation eliminates the effect. #MLX #SwiftUI #MachineLearning
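
The effect described above can be sketched in a few lines. This is a simplified illustration, not the setup from the tweet: it assumes a hypothetical one-dimensional loss `weight * x**2` and scales the whole loss by 10 rather than a single term. SGD's trajectory changes with the scale, while Adam's stays essentially the same, because Adam divides the gradient's running mean by the square root of its running second moment.

```python
import math

def run_sgd(weight, steps=20, lr=0.01, x=5.0):
    """Plain gradient descent on the assumed loss weight * x**2."""
    for _ in range(steps):
        grad = 2.0 * weight * x  # derivative of weight * x**2
        x -= lr * grad
    return x

def run_adam(weight, steps=20, lr=0.01, x=5.0,
             beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam with bias correction on the same assumed loss."""
    m = v = 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * weight * x
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad * grad
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Scaling grad by c scales m_hat by c and sqrt(v_hat) by c,
        # so the ratio (and the step) is unchanged up to eps.
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

if __name__ == "__main__":
    print("SGD  x1 vs x10:", run_sgd(1.0), run_sgd(10.0))
    print("Adam x1 vs x10:", run_adam(1.0), run_adam(10.0))
```

When only one term of a multi-term loss is reweighted, the cancellation is generally only partial; the sketch covers the cleanest case.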

mgostIH (@mgostih)

The discourse over the Apple paper about how LLMs can't reason still shows that there is an insatiable thirst for negativity in anything deep learning related. You can get tens of thousands of likes claiming bullshit in a game of telephone because people want to believe in it.

mgostIH (@mgostih)

A lot of linear RNNs and test time training models are now using forgetting gates, but we don't really do that with standard optimizers like Adam or Muon yet..

mgostIH (@mgostih)

My brain is simulating you rn, but cloth simulations are computationally challenging so we'll have to make do without your clothes in it

mgostIH (@mgostih)

Played "Peak" with friends, it was nice, but there's no way to pause the game at all, even in single-player offline. Given that good runs last hours and wasting time penalizes you, it's a big issue. Don't recommend until they fix this.

Vlado Boza (@bozavlado)

Empirical Fisher (aka estimating Hessian using training labels instead of sampling from model outputs) is one of the biggest scams surrounding neural networks.
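
The distinction in the tweet above can be made concrete with a minimal sketch. It assumes a hypothetical one-parameter logistic model `p(y=1|x) = sigmoid(theta * x)` and made-up data. For this model the true Fisher (expectation over labels drawn from the model) equals the Hessian of the negative log-likelihood, whereas the empirical Fisher (squared gradients at the training labels) can be far off.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(theta, x, y):
    """d/dtheta of log p(y | x) for p(y=1 | x) = sigmoid(theta * x)."""
    return (y - sigmoid(theta * x)) * x

def true_fisher(theta, xs):
    """Squared score averaged over labels sampled from the model itself."""
    total = 0.0
    for x in xs:
        p = sigmoid(theta * x)
        total += p * score(theta, x, 1) ** 2 + (1 - p) * score(theta, x, 0) ** 2
    return total / len(xs)

def empirical_fisher(theta, xs, ys):
    """Squared score evaluated at the training labels."""
    return sum(score(theta, x, y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def hessian_nll(theta, xs):
    """Second derivative of the mean negative log-likelihood."""
    total = 0.0
    for x in xs:
        p = sigmoid(theta * x)
        total += p * (1 - p) * x * x
    return total / len(xs)

if __name__ == "__main__":
    # Hypothetical data; two of the three labels disagree with the model.
    xs, ys, theta = [1.0, 2.0, -1.0], [0, 1, 1], 2.0
    print("Hessian:         ", hessian_nll(theta, xs))
    print("True Fisher:     ", true_fisher(theta, xs))
    print("Empirical Fisher:", empirical_fisher(theta, xs, ys))
```

The exact match between Fisher and Hessian is specific to models like this one (a canonical-link GLM); for deep networks the true Fisher corresponds to the generalized Gauss-Newton matrix rather than the full Hessian.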