Ziming Liu (@zimingliu11)'s Twitter Profile
Ziming Liu

@zimingliu11

PhD student@MIT, AI for Physics/Science, Science of Intelligence & Interpretability for Science

ID: 1390673534033092608

Link: https://kindxiaoming.github.io/
Date: 07-05-2021 14:23:11

541 Tweets

11.11K Followers

746 Following

Neel Nanda (@neelnanda5)

It's a real shame that ICML has decided to automatically reject accepted papers if no author can attend ICML. A top conference paper is a significant boost to early career researchers, exactly the people least likely to be able to afford to go to a conference in Vancouver.

Ziming Liu (@zimingliu11)

Superposition and neural scaling laws are the two amazing phenomena in language models. Our new work shows that they are the two sides of the same coin! In practice, one can control scaling by controlling superposition via a “negative” weight decay, which is kinda crazy :-)

Zhengzhong Tu (@_vztu)

🚀 Generative AI is the Next Quantum Leap for Autonomous Driving - Here's the Hitchhiker's Guide You've Been Waiting For! 🚀

We're thrilled to share our most comprehensive, 128-page

Eric J. Michaud (@ericjmichaud_)

Today, the most competent AI systems in almost *any* domain (math, coding, etc.) are broadly knowledgeable across almost *every* domain. Does it have to be this way, or can we create truly narrow AI systems? In a new preprint, we explore some questions relevant to this goal...