Alex Strick van Linschoten (@strickvl)'s Twitter Profile
Alex Strick van Linschoten

@strickvl

ML Engineer (@zenml_io), researcher (& author of a few books). 🐘: @[email protected] and 🦋: @strickvl.bsky.social. Created geminibyexample.com

ID: 244511094

Link: https://mlops.systems

Joined: 29-01-2011 13:38:59

2.2K Tweets

2.2K Followers

223 Following

Alex Strick van Linschoten (@strickvl)

Mapped out naming conventions across LLM tracing tools and honestly... I wish they'd all just stick to "Trace → Span" 😅 While most follow OpenTelemetry patterns, several providers went creative: Helicone uses "Session → Request", HoneyHive has "Session → Event", and tools

Nicholas Noe (@noenicholas)

Hell on earth & other words. The dominant Israeli “Amalek” position is that much worse is deserved, needed & legal (tho professed belief in latter seems to have faded significantly among “soft” pro-Amalek backers). It’s increasingly likely that this desire will be realized soon.

xjdr (@_xjdr)

you should legally be required to disclose what quantization level you are serving your current model at like it was a nutrition label. you should also be banned from dynamically adjusting quantization based on demand without notification. (you know who you are ...)

Abu Omar (@talaatsyehia)

2 minutes of horror. Pure fucking horror. The drones created an invisible line of death separating children from their families. What trauma these people will carry for the rest of their lives if they survive this atrocity is unimaginable.

Hind Khoudary (@hind_gaza)

I honestly don’t know how I keep doing this every day. I’ve never felt this emotionally drained—not even at the beginning of the war. I’m dehydrated, barely sleeping, and my heart feels shattered into pieces. But you still hear me, no?

Daniel van Strien (@vanstriendaniel)

How to Make Gallery, Library, Archive & Museum Collections Ready for AI 📚✨ Join me this Tuesday – I'm showing cultural institutions how to share data on Hugging Face Hub in a hands-on session for the ai4lam community! 🗓️ June 17, 16:00 UK 🔗 Details: docs.google.com/document/d/1-f…

Nicholas Noe (@noenicholas)

The fetishism of these secret ops-impossible for journos to ever verify-is dangerous. It reinforces the illusion of total mastery & thus does a disservice to the public by corroding the inherent, well-founded concern most people have over launching wars, regime changes etc:

Hugo Bowne-Anderson (@hugobowne)

“If evals is just a metric, then you’re thinking about evals wrong. It’s not a metric, it’s a entire process.” Hamel Husain joined our Building with LLMs course to talk about why most teams get AI evaluation wrong, and what it actually takes to improve AI products. The hardest

Alec MacGillis (@alecmacgillis)

"She dreamed of seeing Coldplay live. She loved trying new foods and was learning Italian. She wrote poetry constantly and shared it w/ friends. She was so proud of having summited Iran’s highest peak, Mount Damavand, that she made sure to mention that fact to everyone she met."

Alex Strick van Linschoten (@strickvl)

Valuable research for what it studies, but misses a crucial eval gap: upstream source quality. Only measuring final citation accuracy ignores how these agents actually gather information. From my experience testing research agents in domains where I used to be a world-class