Andrea Boscutti (@aboscutti)'s Twitter Profile
Andrea Boscutti

@aboscutti

ID: 1129888328915070976

Joined: 18-05-2019 23:15:44

45 Tweets

31 Followers

299 Following

Robbie Barrat (@videodrome):

I'm laughing so hard at this slide a friend sent me from one of Geoff Hinton's courses; "To deal with hyper-planes in a 14-dimensional space, visualize a 3-D space and say 'fourteen' to yourself very loudly. Everyone does it."

Yann LeCun (@ylecun):

Replying to cognito: Convolution is equivariant to translations. Self-attention is equivariant to permutations. They both have a role to play. Conv is efficient for signals with strong local correlations and motifs that can appear anywhere. SelfAtt is good for "object-based" representations where …
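Both equivariances in the tweet above are easy to check numerically. A minimal NumPy sketch (toy sizes; circular convolution so shifts wrap around, and a single attention head with Q = K = V and no positional encodings — all simplifying assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)          # 1-D signal
k = rng.normal(size=16)          # kernel (same length, for circular conv)

def circ_conv(x, k):
    # Circular convolution via FFT.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k)))

# Translation equivariance: shift-then-convolve == convolve-then-shift.
shift = 3
lhs = circ_conv(np.roll(x, shift), k)
rhs = np.roll(circ_conv(x, k), shift)
assert np.allclose(lhs, rhs)

def self_attention(X):
    # Single head, Q = K = V = X, no positional encodings.
    d = X.shape[-1]
    A = X @ X.T / np.sqrt(d)                  # attention scores
    A = np.exp(A - A.max(axis=-1, keepdims=True))
    A = A / A.sum(axis=-1, keepdims=True)     # softmax over keys
    return A @ X

# Permutation equivariance: permute-then-attend == attend-then-permute.
X = rng.normal(size=(5, 4))                   # 5 tokens, dim 4
perm = rng.permutation(5)
assert np.allclose(self_attention(X[perm]), self_attention(X)[perm])
```

Note that positional encodings deliberately break the permutation symmetry of self-attention, which is exactly why they are added in practice.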

JAMA Psychiatry (@jamapsych):

Can machine learning uncover the multivariate neural signature of major depressive disorder in individual patients? This large-scale study including 1,801 patients and controls finds no robust multivariate depression markers. ja.ma/3S0QgFY

Yann LeCun (@ylecun):

Meta has always tried to do the Right Thing. Meta has always practiced open research in AI. Meta has been promoting open source AI platforms. After numerous discussions over the last year (sometimes contentious) a consensus is emerging that open source AI platforms are …

Andrej Karpathy (@karpathy):

# on shortification of "learning"

There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are …

Paul Graham (@paulg):

I just moved the ChatGPT tab over to the left end of my main browser window, where I keep the tabs of things I use all the time, like GMail and Google Calendar.

Yann LeCun (@ylecun):

* Language is low bandwidth: less than 12 bytes/second. A person can read 270 words/minute, or 4.5 words/second; at 0.75 words per token and 2 bytes per token, that is 6 tokens/second, or 12 bytes/s. A modern LLM is typically trained on 1x10^13 two-byte tokens, which is 2x10^13 bytes.
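The back-of-the-envelope arithmetic in the tweet above checks out:

```python
# Reading bandwidth, using the tweet's assumptions.
words_per_minute = 270
words_per_second = words_per_minute / 60              # 4.5 words/s
words_per_token = 0.75
bytes_per_token = 2

tokens_per_second = words_per_second / words_per_token  # 6 tokens/s
bytes_per_second = tokens_per_second * bytes_per_token
print(bytes_per_second)   # 12.0 bytes/s

# Total training data for a typical modern LLM.
training_tokens = 1e13
training_bytes = training_tokens * bytes_per_token
print(training_bytes)     # 2e13 bytes
```

At 12 bytes/s, reading 2x10^13 bytes would take roughly 50,000 years of continuous reading, which is the point of the comparison.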

Cognition (@cognition_labs):

Today we're excited to introduce Devin, the first AI software engineer. Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork. Devin is …

Dan Roberts (@danintheory):

Do LLMs really need to be so L? That's a rejected title for a new paper w/ Andrey Gromov, Kushal Tirumala, Hassan Shapourian, Paolo Glorioso on pruning open-weight LLMs: we can remove up to *half* the layers of Llama-2 70B w/ essentially no impact on performance on QA benchmarks. 1/

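The pruning idea behind the paper can be sketched in a few lines. This is an assumption about the approach rather than the authors' code, and the hidden states below are fake: given representations collected at the input of each layer, find the contiguous block of n layers whose removal perturbs the representation least, measured by angular distance between the states entering layer l and layer l + n.

```python
import numpy as np

rng = np.random.default_rng(0)
L, T, D = 12, 8, 16   # layers, tokens, hidden dim (toy sizes)
# Fake per-layer hidden states; in practice these come from forward
# passes on real data. Cumulative sums mimic residual-stream growth.
H = np.cumsum(rng.normal(scale=0.1, size=(L + 1, T, D)), axis=0)

def angular_distance(a, b):
    # Mean angular distance between corresponding token vectors, in [0, 1].
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))
    return np.mean(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi)

def best_block_to_prune(H, n):
    # Distance between the input of layer l and the input of layer l + n:
    # small distance means layers l..l+n-1 barely change the representation.
    dists = [angular_distance(H[l], H[l + n]) for l in range(L + 1 - n)]
    return int(np.argmin(dists)), min(dists)

start, d = best_block_to_prune(H, n=4)
print(f"prune layers {start}..{start + 3} (angular distance {d:.3f})")
```

After dropping the block, a small amount of fine-tuning (the paper uses parameter-efficient methods) heals the seam between the remaining layers.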
Jayson Jeganathan (@jaysonjeg):

Do you use surface fMRI? We found spurious correlations in surface fMRI, with potentially serious implications for test-retest reliability, fingerprinting, functional parcellations and brain-behaviour associations (1/n) biorxiv.org/cgi/content/sh…

Divyansha (@divyansha1115):

Excited to share our Graph Foundation Model, 🌐 GraphFM, trained on 152 datasets with over 7.4 million nodes and 189 million edges spanning diverse domains. 🚨 Check out our preprint for GraphFM where we test how our model scales with data and model size, and show efficient …

Jonathan Gorard (@getjonwithit):

Moths are attracted to lights because of the same mathematics that underlies twistor theory and compactification in theoretical physics: projective geometry. It all starts from a simple observation: translations are just rotations whose center is located "at infinity". (1/11)

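The observation that a translation is a rotation with its center "at infinity" can be checked numerically: rotate a point about a center pushed farther and farther away, shrinking the angle so the arc length stays fixed, and the motion converges to a straight translation. A small sketch (the specific numbers are just for illustration):

```python
import numpy as np

def rotate_about(p, center, theta):
    # Rotate point p by angle theta (counterclockwise) about center.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return R @ (p - center) + center

p = np.array([1.0, 2.0])
t = 3.0                               # target translation distance along x
for r in [1e2, 1e4, 1e6]:
    center = np.array([0.0, r])       # center pushed far up the y-axis
    theta = t / r                     # arc length r * theta stays equal to t
    q = rotate_about(p, center, theta)
    print(r, q)                       # approaches p + (t, 0) as r grows
```

In projective geometry this limit is exact: translations and rotations are a single family of transformations, distinguished only by whether the fixed point lies on the line at infinity.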
The Transmitter (@_thetransmitter):

With neuroscience datasets and scientific collaborations growing in size, Gaelle Chapuis and Olivier Winter explain why neuroscience needs to create a career path for software engineers. thetransmitter.org/craft-and-care…

Andy Keller (@t_andy_keller):

In the physical world, almost all information is transmitted through traveling waves -- why should it be any different in your neural network? Super excited to share recent work with the brilliant Mozes Jacobs: "Traveling Waves Integrate Spatial Information Through Time" 1/14