Hanlin Tang (@hanlintang) Twitter Tweets • TwiCopy

Awni Hannun

@awnihannun

a year ago

A little snapshot of the number of pretraining tokens for open source LLMs over the past year. Interesting trend:

Likes: 173, Replies: 6, Retweets: 22

clem 🤗

@clementdelangue

a year ago

Not a surprise but DBRX is already #1 trending on HF!

Likes: 184, Replies: 6, Retweets: 20

Denis Yarats

@denisyarats

a year ago

dbrx-instruct by Databricks Mosaic Research is available on labs.perplexity.ai, enjoy!

Likes: 43, Replies: 0, Retweets: 3

Yam Peleg

Absolutely not. The rules of the game are: If you released the model: - You can compare yourself to other open models as much as you want. If you don’t release the model: - Compare yourself also to GPT-4 before claiming who you outperform or not. Comparing only down: Not fair.

Likes: 41, Replies: 3, Retweets: 6

Abhi Venigalla

@ml_hardware

a year ago

This is literally my new LK-99 🙏🙏🙏

Likes: 315, Replies: 6, Retweets: 27

Hanlin Tang

@hanlintang

a year ago

Domain-specific benchmarks matter for enterprise, glad to see DBRX working well. Julia Neagu building some interesting enterprise evals!

Likes: 22, Replies: 0, Retweets: 3

MatthewBerman

@matthewberman

a year ago

DBRX by Databricks ...it's REALLY good!! The new MoE 132B-parameter model is open-source and cost $10M to train. Thank you, Databricks, for your contribution to OS. Check out the full explanation and testing: 🎥👇

Likes: 353, Replies: 8, Retweets: 45

Hanlin Tang

@hanlintang

a year ago

"Let's think step by step, using colorful physical analogies..."

Likes: 4, Replies: 1, Retweets: 0

Bill Yuchen Lin

@billyuchenlin

a year ago

🆕 Check out the recent update of 𝕎𝕚𝕝𝕕𝔹𝕖𝕟𝕔𝕙! We have included a few more models including DBRX-Instruct Databricks and StarlingLM-beta (7B) Nexusflow which are both super powerful! DBRX-Instruct is indeed the best open LLM; Starling-LM 7B outperforms a lot of even

Likes: 122, Replies: 3, Retweets: 30

jasmine collins

@jazco

a year ago

we all know how important LLM evaluation is.. 🤔 i’m excited to FINALLY announce that we are starting a new 📢 recipe-based evals team!!! 📢 for our first study, we compared 5 LLM-generated chili recipes with the prompt: “Give me a chili recipe with an interesting twist” (1/n)

Likes: 146, Replies: 6, Retweets: 17

Hanlin Tang

@hanlintang

a year ago

Good fast evals are the way. Excited for what Julia Neagu and Freddie Vargus have cooking at Quotient AI!

Likes: 15, Replies: 2, Retweets: 1

virat

@virattt

a year ago

Friday is LLM battle day. I added DBRX to the financial metrics challenge. Overall, very impressed with DBRX. Main takeaways:
• correctly calculated metrics
• ranked top 4 fastest models
• competitive pricing
DBRX was +50% cheaper and +100% faster than models in its tier.

Likes: 364, Replies: 13, Retweets: 73

Tessa Barton

@tessybarton

a year ago

LLM evals are a mess! They are noisy, inconsistent, and contradictory. Scaling laws on the other hand have consistently held up to increasing scrutiny. Can we use the reliability of scaling laws to predict the quality of our eval benchmarks?

Likes: 251, Replies: 9, Retweets: 38

Sasha Doubov

@sashadoubov

a year ago

me crafting a training yaml at work (it’s going to crash immediately)

Likes: 45, Replies: 3, Retweets: 3

Ali Ghodsi

@alighodsi

a year ago

Databricks to acquire Tabular (now part of Databricks), a data platform from the original creators of Apache Iceberg. Together, we will bring format compatibility to the lakehouse for Delta Lake and Apache Iceberg databricks.com/blog/databrick…

Likes: 370, Replies: 11, Retweets: 84

Matei Zaharia

@matei_zaharia

a year ago

This Monday in SF, our MosaicX meetup is essentially a mini-conference on AI! Hear from Jonathan Frankle, Sharon Zhou, Jerry Liu, Sarah Catanzaro, Julia Neagu, Yaron, jasmine collins and many others on the latest research.

Likes: 71, Replies: 3, Retweets: 20

Matei Zaharia

@matei_zaharia

a year ago

Super excited about the new Agent Framework, Tool Catalog, Vector Search, Evaluation and Training capabilities we launched today in Mosaic AI. We see more companies building compound AI systems, and we have created an end-to-end environment to do this. databricks.com/blog/mosaic-ai…

Likes: 236, Replies: 2, Retweets: 50

Patrick Wendell

@pwendell

a year ago

Meta AI's Llama release today is really important, likely the most important open source AI announcement ever. Many people don't understand why: 1. The quality gap between the best proprietary and open models has effectively vanished. No one really knew if this gap would get

Likes: 422, Replies: 11, Retweets: 81

Michael Bendersky

@bemikelive

2 months ago

This is a good opportunity to announce that I recently joined the research team at Databricks, where I will be working alongside Jonathan Frankle, Rishabh Singh, Matei Zaharia, Erich Elsen, and many others on the hardest problems at the intersection of information retrieval and AI.

Likes: 35, Replies: 1, Retweets: 6

Freddie Vargus

@freddie_v4

2 months ago

today we're releasing a new small model (0.5B) for detecting problems with tool usage in agents, trained on 50M tokens from publicly available MCP server tools. it's great at picking up on tool accuracy issues and outperforms larger models

Likes: 942, Replies: 14, Retweets: 107
