λndres Mariscal (@serialdev) 's Twitter Profile
λndres Mariscal

@serialdev

Wrote anti-cheat ML, do ML/AI at places you know of and probably use && into graphics||compilers||DBs
I like tech, sloths, and chihuahuas.

ID: 3386323594

Joined: 21-07-2015 19:45:41

3.3K Tweets

328 Followers

3.3K Following

Kostas Anagnostou (@kostasaaa) 's Twitter Profile Photo

Worth a read: "How do I become a graphics programmer? - A small guide from the AMD Game Engineering team": gpuopen.com/learn/how_do_y…

Pavan Jayasinha (@pavanjayasinha) 's Twitter Profile Photo

I implemented an LLM end-to-end in hardware, and ran it on an FPGA.

Zero Python. Zero CUDA. Just pure SystemVerilog.

All my progress + everything I learned from 200h of LLM chip design (demo at the end)👇
Deedy (@deedydas) 's Twitter Profile Photo

DeepSeek just dropped the single best end-to-end paper on large model training.

It covers
— Software (MLA, training in FP8, DeepEP, LogFMT)
— Hardware (Multi-Rail Fat Tree, Ethernet RoCE switches)
— Mix (IBGDA, 3FS filesystem)

DeepSeek's engineering depth is insane. Must read.
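FP8 training hinges on quantizing tensors to 8-bit floats. As a hedged illustration only (this is not DeepSeek's actual recipe; the helper name and the omission of per-tensor scaling are my assumptions), a minimal NumPy sketch of simulated E4M3 rounding:

```python
import numpy as np

def quantize_e4m3(x):
    """Simulate rounding to FP8 E4M3 (4 exponent bits, 3 mantissa bits).

    Hypothetical helper for illustration; real FP8 training kernels
    also maintain per-tensor scaling factors, which are omitted here.
    """
    # E4M3's largest finite value is 448, so saturate there.
    x = np.clip(np.asarray(x, dtype=np.float64), -448.0, 448.0)
    # Exponent of each value, clamped at E4M3's minimum normal exponent (-6).
    exp = np.floor(np.log2(np.maximum(np.abs(x), 2.0 ** -6)))
    # With 3 mantissa bits, representable values are spaced 2^(exp - 3) apart.
    step = 2.0 ** (exp - 3)
    return np.round(x / step) * step
```

For example, 1.1 is not representable in E4M3 and rounds to the nearest representable value, 1.125; values above 448 saturate to 448.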
Nic Fishman (@njwfish) 's Twitter Profile Photo

🚨 New preprint 🚨

We introduce Generative Distribution Embeddings (GDEs) — a framework for learning representations of distributions, not just datapoints.

GDEs enable multiscale modeling and come with elegant statistical theory and some miraculous geometric results!

🧵
Sebastian Aaltonen (@sebaaltonen) 's Twitter Profile Photo

And we are back to triangles for neural graphics :D First it was just a neural network (NeRF), then a sparse octree (NeRF converted to an octree), then Gaussian splats (basically particle splatting), and now triangles :)

Emmanuel Ameisen (@mlpowered) 's Twitter Profile Photo

The methods we used to trace the thoughts of Claude are now open to the public!

Today, we are releasing a library which lets anyone generate graphs which show the internal reasoning steps a model used to arrive at an answer.
Stella Biderman (@blancheminerva) 's Twitter Profile Photo

Two years in the making, we finally have 8 TB of openly licensed data with document-level metadata for authorship attribution, licensing details, links to original copies, and more. Hugely proud of the entire team.

Luca Ambrogioni (@lucaamb) 's Twitter Profile Photo

1/2) It's finally out on Arxiv: Feedback guidance of generative diffusion models!

We derived an adaptive guidance method from first principles that regulates the amount of guidance based on the model's current state.

Complex prompts are highly guided while simple ones are almost free.
Geoff Langdale (@geofflangdale) 's Twitter Profile Photo

I'm working on a good heuristic to put would-be tech influencers on mute/block (depending on obnoxiousness). Current idea is >10K (5K?) followers without any discernible achievements, or some scaled version of the same for people with *some* achievements but who have clearly ...

Stella Biderman (@blancheminerva) 's Twitter Profile Photo

Take the LLaMA 3 paper for another example. I know (from personal experience and talking to others) that many authors of this paper endorse the above view. And yet, not a single model in their scaling laws plots is that large! (7B / 1T = 4.2e22 FLOP)

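The FLOP figure in parentheses follows the standard training-compute approximation C ≈ 6·N·D, where N is the parameter count and D the number of training tokens; a quick check:

```python
# Standard training-compute approximation: C ≈ 6 * N * D,
# with N = parameter count and D = training tokens.
N = 7e9    # 7B parameters
D = 1e12   # 1T tokens
C = 6 * N * D
print(f"{C:.1e}")  # prints 4.2e+22, matching the figure in the tweet
```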
Casper Hansen (@casper_hansen_) 's Twitter Profile Photo

Ex-DeepSeek author of Native Sparse Attention won the best paper award at ACL.

I was lucky enough to attend a live lecture where he revealed:
- scaling up context length to 1 million
- this will be in the next frontier model

There is good reason to believe DeepSeek V4 will use NSA.