McGill NLP (@mcgill_nlp) Twitter Tweets • TwiCopy

Amirhossein Kazemnejad

a year ago

VinePPO, a straightforward modification to PPO, unlocks RL’s true potential for LLM Reasoning. It beats RL-free methods (DPO and RestEM) and PPO, surpassing it in fewer steps (up to 9x), less time (up to 3x), and less KL with half the memory. Time to rethink RL post-training 🧵: [1/n]

484 likes · 7 replies · 92 retweets

Yisong Yue

@yisongyue

a year ago

Is Conference on Language Modeling already the best AI conference? Or am I not supposed to say that as ICLR 2026 general chair?

158 likes · 5 replies · 6 retweets

Siva Reddy

@sivareddyg

a year ago

COLM 2025 will be in Montreal 🇨🇦! Looking forward to welcoming people working on all aspects of language models. See you in October 2025 Conference on Language Modeling

308 likes · 2 replies · 44 retweets

David Ifeoluwa Adelani 🇳🇬

@davlanade

a year ago

Join my lab! I’m currently recruiting new students (MSc & PhD) for admission in the fall of 2025 at Mila - Institut québécois d'IA mila.quebec/en/prospective… Are you interested in multilingual NLP? I would encourage you to apply. Deadline: December 1

569 likes · 13 replies · 239 retweets

Ian Porada

@ian_porada

a year ago

LLMs that "solve" challenge sets might still be relatively inaccurate at resolving diverse, attested instances of the same phenomenon. We show this in the case of Winograd schemas and other related pronominal ambiguities. In CoNLL 2024. #CoNLL2024 #EMNLP2024 #NLProc 1/

13 likes · 1 reply · 3 retweets

Siva Reddy

@sivareddyg

10 months ago

I have multiple vacancies for PhD and Masters students at Mila - Institut québécois d'IA McGill NLP in NLP/ML focusing on representation learning, reasoning, multimodality and alignment. Deadline for applications is Dec 1st. More details: mila.quebec/en/prospective…

160 likes · 1 reply · 64 retweets

Siva Reddy

@sivareddyg

9 months ago

I will be at #NeurIPS2024 Wed and Thu. Tomorrow at UBC for the Future of NLP event presenting "Learning to reason with Generative Models", covering post-training methods and inference time reasoning for LLMs and vision (diffusion) models. Happy to meet anyone interested!

52 likes · 4 replies · 9 retweets

Conference on Language Modeling

@colm_conf

9 months ago

Announcement #1: our call for papers is up! 🎉 colmweb.org/cfp.html And excited to announce the COLM 2025 program chairs Yoav Artzi Eunsol Choi Ranjay Krishna and Aditi Raghunathan

165 likes · 1 reply · 42 retweets

Arkil Patel

@arkil_patel

7 months ago

Presenting ✨ 𝐂𝐇𝐀𝐒𝐄: 𝐆𝐞𝐧𝐞𝐫𝐚𝐭𝐢𝐧𝐠 𝐜𝐡𝐚𝐥𝐥𝐞𝐧𝐠𝐢𝐧𝐠 𝐬𝐲𝐧𝐭𝐡𝐞𝐭𝐢𝐜 𝐝𝐚𝐭𝐚 𝐟𝐨𝐫 𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 ✨ Work w/ fantastic advisors 🇺🇦 Dzmitry Bahdanau and Siva Reddy Thread 🧵:

41 likes · 1 reply · 18 retweets

Karolina Stanczak

@karstanczak

6 months ago

📢New Paper Alert!🚀 Human alignment balances social expectations, economic incentives, and legal frameworks. What if LLM alignment worked the same way?🤔 Our latest work explores how social, economic, and contractual alignment can address incomplete contracts in LLM alignment🧵

89 likes · 1 reply · 27 retweets

Xing Han Lu

@xhluca

6 months ago

Agents like OpenAI Operator can solve complex computer tasks, but what happens when users use them to cause harm, e.g. automate hate speech and spread misinformation? To find out, we introduce SafeArena (safearena.github.io), a benchmark to assess the capabilities of web

78 likes · 0 replies · 34 retweets

Parishad BehnamGhader

@parishadbehnam

6 months ago

Instruction-following retrievers can efficiently and accurately search for harmful and sensitive information on the internet! 🌐💣 Retrievers need to be aligned too! 🚨🚨🚨 Work done with the wonderful Nicholas Meade and Siva Reddy 🔗 mcgill-nlp.github.io/malicious-ir/ Thread: 🧵👇

42 likes · 2 replies · 16 retweets

Nouha Dziri

@nouhadziri

6 months ago

Clock is ticking ⏳⏳submit your agent work to the first workshop for Agent Language Models #ACL2025NLP in Vienna 🎼🎶 We have an exciting lineup of speakers🔥 🗓️Deadline *March 31st* realm-workshop.github.io

34 likes · 0 replies · 7 retweets

VLMs4All - CVPR 2025 Workshop

@vlms4all

6 months ago

📢Excited to announce our upcoming workshop - Vision Language Models For All: Building Geo-Diverse and Culturally Aware Vision-Language Models (VLMs-4-All) #CVPR2025 2025! 🌐 sites.google.com/view/vlms4all

48 likes · 2 replies · 20 retweets

Sara Vera Marjanović

@saraveramarjano

5 months ago

Models like DeepSeek-R1 🐋 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1’s reasoning chains across a variety of tasks; investigating its capabilities, limitations, and behaviour. 🔗: mcgill-nlp.github.io/thoughtology/

227 likes · 3 replies · 62 retweets

Siva Reddy

@sivareddyg

5 months ago

Introducing the DeepSeek-R1 Thoughtology -- the most comprehensive study of R1 reasoning chains/thoughts ✨. Probably everything you need to know about R1 thoughts. If we missed something, please let us know.

81 likes · 0 replies · 21 retweets

Amirhossein Kazemnejad

@a_kazemnejad

5 months ago

Introducing nanoAhaMoment: Karpathy-style, single file RL for LLM library (<700 lines)
- super hackable
- no TRL / Verl, no abstraction💆‍♂️
- Single GPU, full param tuning, 3B LLM
- Efficient (R1-zero countdown < 10h)
comes with a from-scratch, fully spelled out YT video [1/n]

1.1K likes · 15 replies · 164 retweets

Xing Han Lu

@xhluca

5 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories We are releasing the first benchmark to evaluate how well automatic evaluators, such as LLM judges, can evaluate web agent trajectories. We find that rule-based evals underreport success rates, and