Alex Cheema - e/acc (@alexocheema)'s Twitter Profile
Alex Cheema - e/acc

@alexocheema

Building @exolabs | prev @UniOfOxford | We're hiring: exolabs.net

ID: 915614943797551104

Link: https://github.com/exo-explore/exo · Joined: 04-10-2017 16:29:48

4.4K Tweets

36.36K Followers

2.2K Following

Matt Beton (@mattbeton):

are there reasoning models that are better at big-picture thinking/planning, versus lower-level implementation? i’m wondering whether a hybrid strategy of models would be optimal; my workflow at the moment is to plan and theorise with o3, then cursor+claude 4 for implementation
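This plan-then-implement split is easy to prototype. Below is a minimal sketch of such a hybrid pipeline; `call_model` is a hypothetical stand-in for whatever provider SDK you actually use, and the model identifiers are simply the ones named in the tweet.

```python
# Sketch of a hybrid "planner + implementer" model workflow.
# `call_model` is a placeholder: wire it to your provider's SDK
# (OpenAI, Anthropic, etc.). Model names below are illustrative.

def call_model(model: str, prompt: str) -> str:
    """Hypothetical helper: send `prompt` to `model`, return its text reply."""
    raise NotImplementedError("connect this to your provider's API client")

def plan_then_implement(task: str) -> str:
    # Stage 1: a big-picture reasoning model produces a design/plan.
    plan = call_model(
        "o3",
        f"Produce a step-by-step implementation plan for:\n{task}",
    )
    # Stage 2: an implementation-focused model turns the plan into code.
    return call_model(
        "claude-4",  # assumed identifier; substitute your provider's real one
        f"Implement the following plan. Output only code.\n\nPlan:\n{plan}",
    )
```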

Eric Buess (@ericbuess):

I remember Alex Cheema - e/acc's viral video of a small Mac Mini cluster: "Nemotron 70B at 8 tok/sec and scales to Llama 405B". I requested benchmarks, discovered how awesome Alex is via Zoom, and built the initial stages of benchmarks.exolabs.net.

Now Exo v2 launch incoming!

Matt Beton (@mattbeton):

it was a pleasure to be asked to give a talk on our paper 'SPARTA' at ICLR 2025. distributed training isn't a fantasy any more; with algorithmic improvements like this, training models over low-bandwidth environments becomes a reality. read the paper here: openreview.net/forum?id=stFPf…
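The core idea behind communication-efficient schemes in this vein is that workers exchange only a small fraction of their parameters at a time. Here is a toy numpy sketch of sparse parameter averaging; it illustrates the general idea only, not the actual SPARTA algorithm, so see the paper for the real method.

```python
import numpy as np

def sparse_average_step(local_params, peer_params, fraction=0.005, rng=None):
    """Average a small random fraction of a flat parameter vector with a peer.

    Toy illustration of sparse parameter averaging: exchanging only
    `fraction` of the weights per step keeps communication cheap enough
    for low-bandwidth links. Not the algorithm from the SPARTA paper.
    """
    rng = rng or np.random.default_rng(0)  # in practice both peers must pick
    n = local_params.size                  # the same indices, e.g. shared seed
    idx = rng.choice(n, size=max(1, int(fraction * n)), replace=False)
    local_params[idx] = 0.5 * (local_params[idx] + peer_params[idx])
    return local_params
```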

Alex Cheema - e/acc (@alexocheema):

great things come from cold dms
Naval cold dm'd me and invested in EXO Labs
hired Matt Beton after i cold dm'd him
our first customer came from a cold dm
you can just do things

Tycho van der Ouderaa (@tychovdo):

Thrilled to share that I’ve started my new role as a Senior Engineer at Qualcomm Research in Amsterdam. I’ll be joining the Model Efficiency team, where I’ll continue research on quantization and compression techniques for machine learning and AI. Qualcomm Qualcomm Research & Technologies

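Affine quantization, one of the compression techniques mentioned, is compact enough to show inline. Below is a textbook int8-style (scale + zero-point) quantizer in plain numpy; it is a generic illustration of the technique, not anything specific to Qualcomm's research.

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Affine quantization: map floats to uint8 via a scale and zero-point."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats; error is bounded by scale/2 per element."""
    return (q.astype(np.float32) - zero_point) * scale
```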
Alex Cheema - e/acc (@alexocheema):

We’re already doing this with EXO Labs

Last month was the first trial: we provided free M-chip public cloud access to developers at a hackathon. These were M3 Max/Ultra Mac Studios with up to 512GB unified memory.

Awni Hannun gave a talk at the hackathon on how to leverage MLX

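For readers who haven't tried it, here is a minimal taste of MLX, assuming it is installed (`pip install mlx`; Apple Silicon only). Arrays are evaluated lazily and live in unified memory, so there are no explicit host/device transfers.

```python
import mlx.core as mx

# Two large matrices; with unified memory the CPU and GPU see the same buffer.
a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

c = a @ b    # operations are recorded lazily
mx.eval(c)   # force evaluation (runs on the GPU by default)
print(c.shape)
```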
Alex Cheema - e/acc (@alexocheema):

pump is one of the fastest growing startups ever. 0 to $1B ARR in 9 months. 25% of revenue going to $PUMP buybacks is insane; i'm predicting this ends up in the top 10.

Josh Lavorini (@jrlavorini):

if they ever tell my story, let them say I walked with giants; men rise and fall like the winter wheat, but these names will never die.

Alex Cheema - e/acc (@alexocheema):


A new approach to efficient large scale distributed training on Apple Silicon.

Most AI research today is focused on traditional GPUs. These GPUs have a LOT of FLOPS but not much memory, i.e. a low memory:FLOPS ratio. Apple Silicon has a lot more memory available to the GPU
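A quick back-of-the-envelope comparison makes the ratio concrete. The figures below are rough public numbers used purely for illustration (~1000 dense FP16 TFLOPS and 80GB for an H100; ~28 FP16 TFLOPS and 512GB of unified memory for an M3 Ultra Mac Studio):

```python
# GB of memory per TFLOP/s of compute, using rough public spec numbers.
specs = {
    "NVIDIA H100 (80GB)": (80, 1000),
    "M3 Ultra Mac Studio (512GB)": (512, 28),
}
for name, (mem_gb, tflops) in specs.items():
    print(f"{name}: {mem_gb / tflops:.2f} GB per TFLOP/s")
# H100 comes out around 0.08 GB/TFLOP; the Mac Studio around 18 GB/TFLOP.
```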