Darshan Deshpande (@getdarshan) Twitter Tweets • TwiCopy


Darshan Deshpande

@getdarshan

3 years ago

Diffusion models are amazing! Want to know what makes them special? Join me at the TFUG event on the 4th of June to know more about them!


Extending my presentation, I've written an article on Diffusion models that includes #JAX code and explains the math in detail. This will also be my Weights & Biases Blogathon submission! Check it out here 🤗: bit.ly/diffusing-away…


Darshan Deshpande

@getdarshan

3 years ago

I am excited to announce that my notebook comparing the code and mathematics behind DDPMs and DDIMs won Kaggle's Google OSS Expert Prize! 🥳 If you are interested in diffusion models then you can find my notebook here: kaggle.com/code/darshan15… #MachineLearning #OpenSource


Darshan Deshpande

@getdarshan

3 years ago

I know I've been inactive for a while but I wanted to put this out there. I've joined USC Viterbi School for my MSCS this Fall and am working with an amazing team at the Information Sciences Institute on some amazing NLP research. Couldn't ask for more 🤗



Darshan Deshpande

@getdarshan

a year ago

🚀🔥 Excited to announce our NAACL-2024 paper introducing ✨SPARK✨, a novel framework leveraging large language models for generalizable and effective argument quality evaluation. Paper: arxiv.org/abs/2305.12280 #NLP #LLM #NAACL2024 #AI #MachineLearning


Darshan Deshpande

@getdarshan

a year ago

🆕 Curious about the origins of LLM alignment? Check my recent report that explores the topic in depth (with accompanying code for applying RLHF on Google AI's Gemma using PPO!) 🎉⚖️


PatronusAI

@patronusai

a year ago

1/ Introducing Lynx v1.1: an 8B State-of-the-Art RAG hallucination detection model 🚀 - Beats Claude-3.5-Sonnet on HaluBench by 3.0% - Outperforms GPT-4o on medical questions and answers by 6.8% - 1.4% higher accuracy than Lynx v1.0 on HaluBench Try it out on HuggingFace


PatronusAI

@patronusai

a year ago

Llama Guard is Off Duty 😲 It’s weak at toxicity detection! We benchmarked popular toxicity datasets spanning languages like Portuguese, Ukrainian, and Turkish, and found that Llama Guard has a very high false negative rate for toxic content! We found that base models like


NEC Laboratories Europe

@neclabseu

10 months ago

Prototype-based networks can greatly enhance the robustness of #languagemodels in text classification, addressing real-world needs by combining robustness & interpretability for #trustworthyAI. Learn how in our Findings of #EMNLP24 accepted paper. neclab.eu/research-group… #NECLabs


Darshan Deshpande

@getdarshan

8 months ago

Hey everyone, I am at #EMNLP2024 this week, co-presenting our work on Prototype based Networks with Zhivar Sourati. Please reach out if you are interested in AI evaluations, interpretability or model alignment!


PatronusAI

@patronusai

7 months ago

1/ Introducing Lynx v2.0: an 8B State-of-the-Art RAG hallucination detection model 🚀 - Beats Claude-3.5-Sonnet on HaluBench by 2.2% - 3.4% higher accuracy than Lynx v1.1 on HaluBench - Optimized for long context use cases - Detects 8 types of common hallucinations, including


Darshan Deshpande

@getdarshan

7 months ago

I am excited to announce the release of our Glider model - small size, multi metric evals, explainable highlight spans, multilingual generalization, amazing subjective metric performance - Check it out!! Paper: arxiv.org/abs/2412.14140…


Darshan Deshpande

@getdarshan

7 months ago

I'm calling it right now - distilling reasoning chains is going to be the next big thing! ⛏️ OpenAI #OpenAi #o3


Darshan Deshpande

@getdarshan

7 months ago

While experimenting with alignment methods, we observed that APO was more robust to noise in synthetic training data as compared to DPO or KTO. Thanks for the excellent contribution to the community Karel D’Oosterlinck and team 🚀


PatronusAI

@patronusai

3 months ago

1/ Ever tried to remember the name of a movie you’ve seen – you can picture the scenes clearly, but the movie name won’t come to you? Introducing BLUR: the first agent benchmark for tip-of-the-tongue search and reasoning 🔥 We benchmarked SOTA agents and found that the


PatronusAI

@patronusai

3 months ago

We're excited to introduce the BLUR Leaderboard on Hugging Face 🔥 Earlier today, we open sourced BLUR: the first agent benchmark for tip-of-the-tongue search and reasoning. It measures how effectively agents can help you identify something you vaguely remember, but can’t


Annie Franco

@anniefranco

3 months ago

Building good benchmarks is hard, and PatronusAI has released what may be the coolest agent eval yet: ✅ Realistic and objectively useful task ✅ Multilingual, multimodal, and multi-domain ✅ Easy for humans, still challenging for agents


Annie Franco

@anniefranco

3 months ago

My colleague Chris McConnell and I greatly enjoyed seeing Sky CH. Wang Darshan Deshpande Rebecca Qian Anand Kannappan bring this project to life. We’re excited to finally see it out in the world, and look forward to collaborating on the next one!


Darshan Deshpande

@getdarshan

2 months ago

Non-deterministic trajectories need autonomous supervision. Introducing Percival, a SoTA system to detect issues with long context agentic problems and suggest fixes to systems. The time to make a move towards autonomous evaluations is now! 🔥


Clémentine Fourrier 🍊

@clefourrier

2 months ago

Check out the very cool work from our friends PatronusAI 🔥 here! huggingface.co/spaces/Patronu…
