
Michał Bortkiewicz @ICLR
@m_bortkiewicz
PhD at Warsaw University of Technology
Working on RL and Continual Learning
ID: 1357769711963021313
https://michalbortkiewicz.github.io/ 05-02-2021 19:15:21
81 Tweets
134 Followers
390 Following


🔥 New ICLR 2025 Paper! It would be cool to control the content of text generated by diffusion models with less than 1% of the parameters, right? And how about doing it across diverse architectures and within various applications? 🚀 🫡 Together with Lukasz Staniszewski, we show how: 🧵 1/

1/ While most RL methods use shallow MLPs (~2–5 layers), we show that scaling up to 1000 layers for contrastive RL (CRL) can significantly boost performance, with gains ranging from 2x to 50x on a diverse suite of robotic tasks. Webpage+Paper+Code: wang-kevin3290.github.io/scaling-crl/

Check out this new paper by Kevin Wang, myself, Michał Bortkiewicz, Tomasz Trzcinski, and Ben Eysenbach! We show a method for scaling Contrastive RL, leading to significant performance improvements.

🚨 Scaling RL: most RL methods' performance saturates at ~5 layers. In this work, led by Kevin Wang, we crack the right configuration for scaling Contrastive RL and go beyond 1000-layer NNs! Deep NNs unlock emergent behaviors and other cool properties. Check out Kevin's thread!
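
For a sense of what a network that stays trainable at extreme depth can look like, here is a minimal JAX sketch of a pre-norm residual MLP stack. This is an illustrative assumption about one standard recipe for very deep networks (residual connections plus layer norm), not the paper's exact architecture or hyperparameters; all sizes below are placeholders.

import jax
import jax.numpy as jnp

def init_params(key, depth=64, width=256, in_dim=32, out_dim=8):
    # depth/width/in_dim/out_dim are illustrative, not the paper's configuration
    keys = jax.random.split(key, depth + 2)
    return {
        "embed": jax.random.normal(keys[0], (in_dim, width)) / jnp.sqrt(in_dim),
        "blocks": [jax.random.normal(k, (width, width)) / jnp.sqrt(width)
                   for k in keys[1:-1]],
        "head": jax.random.normal(keys[-1], (width, out_dim)) / jnp.sqrt(width),
    }

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / jnp.sqrt(var + eps)

def forward(params, x):
    h = x @ params["embed"]
    for w in params["blocks"]:
        # Pre-norm residual block: the skip connection keeps gradients usable
        # even when the stack is hundreds of layers deep.
        h = h + jax.nn.relu(layer_norm(h)) @ w
    return layer_norm(h) @ params["head"]

params = init_params(jax.random.PRNGKey(0))
obs = jnp.ones((4, 32))            # a batch of 4 dummy observations
print(forward(params, obs).shape)  # (4, 8)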


Instytut Ideas can begin operating: yesterday its entry was made in the KRS register. Alongside the good news, however, there is also far more sobering news about the illness of one of the team leaders who was soon to start work. Łukasz Kuciński is fighting a glioma and is seeking support at:


Łukasz Kuciński is not only an excellent researcher, but also a truly great person. Kind, thoughtful, and wise. And all that should be enough to support him in his fight against cancer. But he is also a father and a husband with a loving family worth fighting for. Support him 🙏




New paper: Deceptive LLMs may keep secrets from their operators. Can we elicit this latent knowledge? Maybe! Our LLM knows a secret word, which we extract with mech interp & black-box baselines. We open-source our model; how much better can you do? w/ Emil Ryd, Senthooran Rajamanoharan, Neel Nanda
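
As a rough illustration of what one black-box baseline could look like (the model path, prompt, and scoring heuristic below are hypothetical placeholders, not the released model or the paper's method): sample the model many times and count which words it leaks most often.

# Hypothetical black-box elicitation sketch: sample many completions and
# tally candidate secret words. Model path and prompt are placeholders.
from collections import Counter
from transformers import pipeline

generator = pipeline("text-generation", model="path/to/secret-word-model")  # placeholder path

prompt = "Give a one-word hint about your secret word:"
outputs = generator(prompt, do_sample=True, num_return_sequences=20, max_new_tokens=5)

# Crude heuristic: treat the last word of each completion as a candidate.
candidates = Counter(o["generated_text"].split()[-1].strip(".,!").lower() for o in outputs)
print(candidates.most_common(5))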






