Faisal Ladhak (@faisalladhak) Twitter Tweets • TwiCopy

I've done a deep dive into SB 1047 over the last few weeks, and here's what you need to know: *Nobody* should be supporting this bill in its current state. It will *not* actually cover the largest models, nor will it actually protect open source. But it can be easily fixed!🧵

Benjamin Clavié

@bclavie

a year ago

🥁🥁 New blog post out (link in thread), w/ two aims: 🤓 Providing a clear, hopefully easy-to-read intro to ColBERT, without assuming you've ever used it. 🏊Introducing ColBERT Token Pooling ✨: You can reduce the size of ColBERT indexes by 66% with barely any performance hit!

Griffin Adams

@griffinadams92

a year ago

New Conference on Language Modeling paper with Noémie Elhadad! SPEER: Sentence-Level Planning of Long Clinical Summaries via Embedded Entity Retrieval arxiv.org/abs/2401.02369 *tl;dr* We introduce R^3 decoding--integrating RAG into planning--to improve faithfulness & coverage on long summarization!

Esin Durmus

@esindurmusnlp

a year ago

Will present this work on testing the global opinions represented in language models at Conference on Language Modeling 🌍

Jeremy Howard

@jeremyphoward

a year ago

Someone noticed our not-quite-launched new lib for WebGPU programming on GitHub and now it's on the front page of HN! It's created by Austin Huang and he'll be publishing a blog post about it very soon. But since it's out in the open now, here you go :D github.com/AnswerDotAI/gp…

Austin Huang

@austinvhuang

a year ago

Announcing: The initial release of my 1st project since joining the amazing team here at Answer.AI: gpu.cpp, portable C++ GPU compute using WebGPU. Links + info + a few demos below 👇

Alexis Gallagher

@alexisgallagher

a year ago

Can LLMs reason and solve other multi-step tasks? They sound like they can, but they often fail wildly. I've written a post on the "Faith and Fate" paper from Yejin Choi's team, which provides a great intuition for this, arguing that what models ARE doing is *linearized subgraph matching*.

Jeremy Howard

@jeremyphoward

a year ago

Announcing FastHTML. A new way to create modern interactive web apps. Scales down to a 6-line python file; scales up to complex production apps. Auth, DBs, caching, styling, etc built-in & replaceable and extensible. 1-click deploy to Railway, Vercel, Hugging Face, & more.

Griffin Adams

@griffinadams92

a year ago

Announcing Cold Compress 1.0 with Answer.AI: a hackable toolkit for using and creating KV cache compression methods. Built on top of Horace He and Team’s GPT-Fast for torch.compilable, light-weight performance. Develop novel methods in as little as 1 line of new code.

Esin Durmus

@esindurmusnlp

10 months ago

We'll present this at #NeurIPS

Karina Nguyen

@karinanguyen_

10 months ago

My vision for the ultimate AGI interface is a blank canvas. The one that evolves, self-morphs over time with human preferences and invents novel ways of interacting with humans, redefining our relationship with AI technology and the entire Internet. But here are some of the

Vik Paruchuri

@vikparuchuri

10 months ago

Announcing Surya Table Recognition! It uses a new architecture to outperform table transformer, the current SoTA open source model.
- Recognizes table rows, columns, and cells
- Works with complex layouts and rotated tables
- Supports any language
- Runs locally

Dario Amodei

@darioamodei

10 months ago

Machines of Loving Grace: my essay on how AI could transform the world for the better darioamodei.com/machines-of-lo…

Benjamin Warner

@benjamin_warner

10 months ago

This gradient accumulation implementation bug doesn't affect all training frameworks. For example, Composer has the accumulate_train_batch_on_tokens option, which should prevent this issue. I would be surprised if other training frameworks didn't have similar options.
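
The tweet does not spell out the bug itself. One common description, assumed here, is that averaging each microbatch's mean loss gives a different result from averaging over all tokens whenever microbatches contain different numbers of tokens. The sketch below is a plain-Python illustration with made-up per-token losses; it does not use Composer or any training framework, and the numbers are hypothetical.

```python
# Hypothetical per-token losses for two microbatches of unequal length.
microbatches = [
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # 6 tokens
    [8.0, 8.0],                      # 2 tokens
]


def mean(xs):
    return sum(xs) / len(xs)


# Naive accumulation: take each microbatch's mean loss, then average those means.
# Every microbatch gets equal weight regardless of how many tokens it holds.
naive_loss = mean([mean(mb) for mb in microbatches])

# Token-weighted accumulation: sum every token loss, divide by the total token count.
# This matches what a single large batch would compute.
total_tokens = sum(len(mb) for mb in microbatches)
token_weighted_loss = sum(sum(mb) for mb in microbatches) / total_tokens

print(naive_loss)           # 5.0
print(token_weighted_loss)  # 3.5
```

In this toy case the short microbatch is over-weighted by the naive average, which is the kind of mismatch a token-based accumulation option is meant to avoid.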

Nathan Cooper

@ncooper57

10 months ago

As a lead researcher at @stabilityai, I worked a lot with synthetic data to train LLMs and VLMs. It is the most underrated way of boosting model performance. Now at Answer.AI I've been working to make the best practices easy—read on to learn how!

Rada Mihalcea

@radamihalcea

10 months ago

The new GSM-Symbolic paper from Apple has been making waves, but we published very similar findings earlier this year. Using nearly the same symbolic template methodology on GSM8k problems, we demonstrated the reasoning limitations of LLMs. arxiv.org/pdf/2401.09395

Anthropic

@anthropicai

9 months ago

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

Anthropic

@anthropicai

9 months ago

New Anthropic research: Evaluating feature steering. In May, we released Golden Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature steering”. We've now done a deeper study on the effects of feature steering. Read the post: anthropic.com/research/evalu…

Esin Durmus

@esindurmusnlp

9 months ago

Excited to share my new research on evaluating feature steering: I ran quantitative evaluations on how steering specific features affects model behavior. I identified a 'sweet spot' for maintaining capabilities, and found both targeted and off-target effects on social biases 🎯

Eugene Bagdasarian

@ebagdasa

9 months ago

🧙 I am recruiting PhD students and postdocs to work together on making sure AI Systems and Agents are built safe and respect privacy (+ other social values). Apply to UMass Amherst Manning College of Information & Computer Sciences and enjoy a beautiful town in Western Massachusetts. Reach out if you have questions!
