Wasi Ahmad (@ahmadwasi) Twitter Tweets • TwiCopy

Wasi Ahmad

2 years ago

🚀 Our latest research paper on code representation learning, CodeSage, outperforms OpenAI text-embedding-3-large on Code2Code search and is on par with it on NL2Code search tasks! Dive into the techniques and insights on the blog: code-representation-learning.github.io

Likes: 11 · Replies: 1 · Reposts: 0

Wasi Ahmad

@ahmadwasi

2 years ago

🚀 Dive into the cutting-edge research exploring keyphrase generation! The work delves deep into evaluating keyphrase generation on diversity, utility, faithfulness, and reference alignment.

Likes: 3 · Replies: 0 · Reposts: 0

Wasi Ahmad

@ahmadwasi

2 years ago

Introducing RepoFormer! A repository-level code completion framework. The project was led by Di during his internship (summer'23) at AWS AI Labs. Read the paper to learn about the awesome work.

Likes: 8 · Replies: 0 · Reposts: 0

Astitva Srivastava

@sarcastitva

2 years ago

Computer Vision conference's acceptance criteria these days: #CVPR2024 #eccc2024 #AI #ComputerVision

Likes: 2.2K · Replies: 42 · Reposts: 381

Wasi Ahmad

@ahmadwasi

2 years ago

🔥 Introducing IllusionVQA 🔥 A challenging dataset to test VLMs' ability to locate and comprehend optical illusions. While humans achieved near perfect accuracy, GPT4V, the best-performing VLM, achieved 63% and 49.7% accuracy on the comprehension and localization tasks.

Likes: 9 · Replies: 0 · Reposts: 1

Wasi Ahmad

@ahmadwasi

a year ago

🚀 Introducing “Repoformer: Selective Retrieval for Repository-Level Code Completion” accepted at #ICML2024.

Likes: 11 · Replies: 0 · Reposts: 0

Arif's Den 𝕏

@arifsden

a year ago

My twitter network Please help me to spread this. #SaveBangladeshStudents #SaveBangladeshiStudents #StudentProtest #StarlinkforBangladesh

Likes: 1.1K · Replies: 558 · Reposts: 1.1K

Di Wu

@diwu0162

a year ago

Introducing LongMemEval: a comprehensive, challenging, and scalable benchmark for testing the long-term memory of chat assistants.
📊 LongMemEval features:
• 📝 164 topics
• 💡 5 core memory abilities
• 🔍 500 manually created questions
• ⏳ Freely extensible chat history

Likes: 72 · Replies: 1 · Reposts: 29

Wasi Ahmad

@ahmadwasi

8 months ago

Direct RL without SFT improved pass@1 from 61.6 to 84.1. Do you have numbers for SFT using the released 89k data?

Likes: 2 · Replies: 2 · Reposts: 0

Terry Yue Zhuo

@terryyuezhuo

8 months ago

Today, we announce a collaboration between SWE Arena (Computer Intelligence) and Hugging Face (w/ Gradio). We believe that Hugging Face can help us shape the future of AI Software Engineering evaluations. We have now open-sourced the SWE Arena codebase to accelerate the development of

Likes: 85 · Replies: 4 · Reposts: 21

Mostofa Patwary

@mapatwary

6 months ago

Nemotron-H base models (8B/47B/56B): A family of Hybrid Mamba-Transformer LLMs are now available on HuggingFace: huggingface.co/nvidia/Nemotro… huggingface.co/nvidia/Nemotro… huggingface.co/nvidia/Nemotro… Technical Report: arxiv.org/abs/2504.03624 Blog: research.nvidia.com/labs/adlr/nemo…

Likes: 27 · Replies: 1 · Reposts: 12

Aran Komatsuzaki

@arankomatsuzaki

5 months ago

Nvidia presents Llama-Nemotron: Efficient Reasoning Models - An open family of models w/ exceptional reasoning capabilities and inference efficiency - Discusses the training procedure, incl. NAS from Llama 3 for accelerated inference, knowledge distillation, and continued

Likes: 300 · Replies: 2 · Reposts: 48

Wasi Ahmad

@ahmadwasi

3 months ago

Consider submitting your work at DL4C.

Likes: 3 · Replies: 0 · Reposts: 0

Wasi Ahmad

@ahmadwasi

2 months ago

Please consider contributing to the workshop as a reviewer.

Likes: 2 · Replies: 0 · Reposts: 0

Wasi Ahmad

@ahmadwasi

24 days ago

Tested K2-Think on LiveCodeBench (v6) — 2408:2505 (454 samples). Got pass@1[avg-of-10] = 60.8%, vs. Nemotron-32B at 69.8%. Disappointing to see such inflated/false scores being reported.

Likes: 7 · Replies: 0 · Reposts: 1