Alex Ratner (@ajratner)'s Twitter Profile
Alex Ratner

@ajratner

@SnorkelAI @uwcse / prev @StanfordAILab – Interested in data management systems for machine learning, weak supervision, and impactful applications.

ID: 2189702274

Link: https://ajratner.github.io/
Joined: 12-11-2013 05:50:18

1.1K Tweets

5.5K Followers

580 Following

Albert Ge (@albert_ge_95)

Online data mixing reduces training costs for foundation models, but faces challenges: ⚠️ Human-defined domains miss semantic nuances ⚠️ Limited eval accessibility ⚠️ Poor scalability Introducing 🎵R&B: first regroup data, then dynamically reweight domains during training!

Alex Ratner (@ajratner)

Scale alone is not enough for AI data. Quality and complexity are equally critical. Excited to support all of these for LLM developers with Snorkel AI Data-as-a-Service, and to share our new leaderboard! — Our decade-plus of research and work in AI data has a simple point:

Snorkel AI (@snorkelai)

Foundation models are great at public knowledge, but fall short on domain-specific tasks. 📋 That’s why we’re working with the brightest in their fields to build high-quality training data that actually makes AI useful. Want to learn more? 👉 snorkel.ai/expert-communi…

Alex Ratner (@ajratner)

Our decade of work on AI data development has always been about *accelerating* the subject matter expert, not replacing them! Where automation is possible, saturation has been reached. The key to real AI delta is expert knowledge, which all comes down to the amazing experts!!

Snorkel AI (@snorkelai)

Huge thanks to Nasdaq for featuring our Series D! 👏 We’re using this momentum to solve the toughest data challenges in enterprise AI—from evaluation to expert curation. Building AI? Let’s talk data. 📈 #SnorkelAI #AIInfrastructure #LLMs #DataAsAService

Eric Glyman (@eglyman)

Today, Ramp reached a new valuation: $16 billion. Let the robots chase your receipts and close your books, so you can use your brain and build things. That's the way AI was meant to be.

Braden Hancock (@bradenjhancock)

A frequent evaluation mistake I see: assuming you need orders of magnitude more data than you actually do. What different evaluation set sizes are good for:

Snorkel AI (@snorkelai)

Three days out! 👏 We're going live with AI leaders from Accenture US, QBE, @Comcast, BNY & more. 🔹 Expert data + agentic AI 🔹 Live demos 🔹 Real-world use cases 🔹 Fresh research RSVP: snorkel.ai/events/develop… #AgenticAI #SnorkelAI

Snorkel AI (@snorkelai)

Highlights from Henry Kiss Ehrenberg’s theCUBE appearance on the future of AI: 🟦 Data strategy is key 🟦 Expert data drives real advantage 🟦 Trust & compliance are critical 👉 Full convo here: youtube.com/watch?v=Qjt-d9…

Jon Saad-Falcon (@jonsaadfalcon)

How can we close the generation-verification gap when LLMs produce correct answers but fail to select them? 🧵 Introducing Weaver: a framework that combines multiple weak verifiers (reward models + LM judges) to achieve o3-mini-level accuracy with much cheaper non-reasoning

Alex Ratner (@ajratner)

Very exciting work on using weak supervision for RL, closing the “generation-verification gap”!! Once again, principled approaches to labeling/data development are the keys!

Mayee Chen (@mayeechen)

LLMs often generate correct answers but struggle to select them. Weaver tackles this by combining many weak verifiers (reward models, LM judges) into a stronger signal using statistical tools from Weak Supervision—matching o3-mini-level accuracy with much cheaper models! 📊

Azalia Mirhoseini (@azaliamirh)

Introducing Weaver, a test time scaling method for verification! Weaver shrinks the generation-verification gap through a low-overhead weak-to-strong optimization of a mixture of verifiers (e.g., LM judges and reward models). The Weavered mixture can be distilled into a tiny

Snorkel AI (@snorkelai)

We just dropped a benchmark dataset on Hugging Face to test AI agents on real-world insurance underwriting tasks—built with CPCU experts. Most models still struggle. Here’s how to evaluate them right: 🧠 Dataset: huggingface.co/datasets/snork…

Jieyu Zhang (@jieyuzhang20)

Tokenization kickstarts every Transformer pipeline—shaping how models digest data. Our latest work introduces semantic, grounded video tokenization, leveraging objectness cues to boost efficiency and performance of video understanding models.

Snorkel AI (@snorkelai)

Working with Amazon Web Services to push what’s possible in financial services: from AI agents for underwriting to data-driven copilots. When the data’s right, the system works. #SnorkelAI #AgenticAI #GenAI #AIinFinance
