Yu Zhao (@yuzhaouoe)'s Twitter Profile
Yu Zhao

@yuzhaouoe

@EdinburghNLP NLP/ML | Opening the Black Box for Efficient Training/Inference

ID: 1511963648826097670

Link: https://yuzhaouoe.github.io/ | Joined: 07-04-2022 07:06:54

188 Tweets

351 Followers

593 Following

Rohit Saxena (@rohit_saxena)

Can multimodal LLMs truly understand research poster images? 📊

🚀 We introduce PosterSum, a new multimodal benchmark for scientific poster summarization! 🪧

📂 Dataset: huggingface.co/datasets/rohit…
📜 Paper: arxiv.org/abs/2502.17540
Rohit Saxena (@rohit_saxena)

📣 This work will appear at the ICLR 2025 Workshop on Reasoning and Planning for LLMs. 🇸🇬 I'm currently on the job market, looking for research scientist roles. Feel free to reach out if you're hiring or know of any opportunities!

Tongyao Zhu @ ICLR 25 🇸🇬 (@tongyao_zhu)

🚀 Excited to share our new paper: SkyLadder: Better and Faster Pretraining via Context Window Scheduling!

Have you ever noticed the ever-increasing ⬆ context window of pretrained language models? The first generation of GPT had a context length of 512, followed by 1024 for GPT2,
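The core idea can be pictured as a schedule that starts pretraining with a short context window and grows it toward the target length over training. The toy ramp below only illustrates that idea; the function name, constants, and linear shape are assumptions, not SkyLadder's actual schedule.

```python
def context_window_at(step: int,
                      total_steps: int,
                      start_len: int = 512,
                      target_len: int = 8192,
                      ramp_fraction: float = 0.8) -> int:
    """Toy linear context-window schedule: grow from start_len to target_len
    over the first ramp_fraction of training, then hold at target_len.
    Illustrative only; SkyLadder's real schedule may differ."""
    ramp_steps = int(total_steps * ramp_fraction)
    if step >= ramp_steps:
        return target_len
    progress = step / max(ramp_steps, 1)
    return int(start_len + progress * (target_len - start_len))

# Example: the length each training batch would be packed to at a given step.
for step in (0, 25_000, 50_000, 100_000):
    print(step, context_window_at(step, total_steps=100_000))
```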
Mohd Sanad Zaki Rizvi (@sanad_maker)

🚀 New arXiv paper alert! By combining agentic frameworks (ReAct) with smart decoders (DeCoRe, DoLa, CAD), we boost factual accuracy in complex reasoning tasks, reducing those annoying hallucinations! 🔥 🔗 Paper: arxiv.org/abs/2503.23415
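For the decoding side, one of the named methods, context-aware decoding (CAD), can be sketched as contrasting next-token logits computed with and without the supporting context. The snippet below is a schematic illustration under that reading; the function name and alpha value are assumptions, not the paper's code.

```python
import torch

def cad_adjusted_logits(logits_with_context: torch.Tensor,
                        logits_without_context: torch.Tensor,
                        alpha: float = 0.5) -> torch.Tensor:
    """Context-aware decoding, schematically: amplify tokens whose likelihood
    rises when the context is present, downweight tokens the model would
    produce from its parametric memory alone."""
    return (1.0 + alpha) * logits_with_context - alpha * logits_without_context

# Toy usage with random logits standing in for two forward passes
# (one with the retrieved context in the prompt, one without).
vocab = 10
l_ctx, l_noctx = torch.randn(vocab), torch.randn(vocab)
next_token = torch.argmax(cad_adjusted_logits(l_ctx, l_noctx))
```

Inside a ReAct-style loop, each generation step would then sample from the adjusted distribution instead of the raw one.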

Piotr Miłoś (@piotrrmilos)

My good friend has an ongoing fight with cancer.

A great father and husband to his family. An excellent co-author for me and many other ML folks.

Please support and share! (link in the comments!)
Pasquale Minervini is hiring postdocs! 🚀 (@pminervini)

My amazing collaborators will present several works at ICLR and NAACL later this month -- please catch up with them if you're attending! I tried to summarise our recent work in a blog post: neuralnoise.com/2025/march-res…
Hongru Wang (@wangcarrey)

💥 We are so excited to introduce OTC-PO, the first RL framework for optimizing LLMs' tool-use behavior in Tool-Integrated Reasoning. arXiv: arxiv.org/pdf/2504.14870 Hugging Face: huggingface.co/papers/2504.14… ⚙️ Simple, generalizable, plug-and-play (just a few lines of code) 🧠
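One way to picture "optimizing tool-use behavior" is as reward shaping that keeps credit for correct answers while discouraging unnecessary tool calls. The snippet below is a rough, hypothetical sketch of that general idea; the function, penalty form, and constants are assumptions and not OTC-PO's actual objective.

```python
def shaped_reward(is_correct: bool, num_tool_calls: int,
                  penalty_per_call: float = 0.1) -> float:
    """Toy reward for tool-integrated reasoning: full credit for a correct
    answer, scaled down as the trajectory spends more tool calls.
    Illustrative only; OTC-PO's actual reward is defined in the paper."""
    base = 1.0 if is_correct else 0.0
    return base * max(0.0, 1.0 - penalty_per_call * num_tool_calls)

# A correct answer with 0 calls scores 1.0; with 5 calls, 0.5; wrong answers score 0.
print(shaped_reward(True, 0), shaped_reward(True, 5), shaped_reward(False, 2))
```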

Hongru Wang (@wangcarrey)

🎉 Thrilled to share our TWO #NAACL2025 oral papers! 👇 Welcome to catch me and talk about anything!

1️⃣ Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
📅 30 Apr • 11:30–11:45 AM • Ballroom C
TLDR: A general representation learning
Ne Luo (seeking PhD opportunities) (@neluo19)

Hi! I will be attending #NAACL2025 and presenting our paper on self-training for tool-use today, an extended work of my MSc dissertation at EdinburghNLP, supervised by Pasquale Minervini (@PMinervini).

Time: 14:00-15:30
Location: Hall 3

Let's chat and connect! 😊
Aryo Pradipta Gema (@aryopg)

MMLU-Redux just touched down at #NAACL2025! 🎉
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋
Aidan Clark (@_aidan_clark_)

No LLM researcher should spend their whole life on one side of the pre-/post-training divide. The former teaches you what is actually happening, the latter reminds you what actually matters.

Wenhao Yu (@wyu_nd)

🚀 We release MMLongBench: a benchmark for evaluating long-context VLMs.
📊 13,331 examples across 5 tasks:
– Visual RAG
– Many-shot ICL
– Needle-in-a-haystack
– VL Summarization
– Long-document VQA
Lengths: 8 / 16 / 32 / 64 / 128K
🔍 Benchmarking both thoroughly & effectively!
Zhaowei Wang (@zhaoweiwang4)

🚨 New paper! 🚨
Many recent LVLMs claim massive context windows, but can they handle long contexts on diverse downstream tasks? 🤔
💡 In our new paper, we find that most models still fall short!

We introduce MMLongBench, the first comprehensive benchmark for long-context VLMs:
Daniel Scalena (@daniel_sc4)

💡 We compare prompting (zero- and multi-shot + explanations) and inference-time interventions (ActAdd, REFT and SAEs).

Following SpARE (Yu Zhao, Alessio Devoto), we propose ✨ contrastive SAE steering ✨ with mutual info to personalize literary MT by tuning latent features 4/
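Schematically, SAE-based steering of this kind adds selected sparse-autoencoder decoder directions into a hidden state at inference time, with features (and their signs) chosen by contrasting two behaviours, e.g. via mutual information. The sketch below is a minimal toy version; the hook point, feature-selection rule, and strengths are assumptions rather than the exact SpARE recipe.

```python
import torch

def steer_with_sae_features(hidden: torch.Tensor,
                            sae_decoder: torch.Tensor,   # (num_features, d_model)
                            feature_ids: list[int],
                            strengths: list[float]) -> torch.Tensor:
    """Add chosen SAE decoder directions to a residual-stream activation.
    Contrastive steering would pick feature_ids and signs by comparing
    feature activations between two behaviours. Toy sketch only."""
    steered = hidden.clone()
    for fid, s in zip(feature_ids, strengths):
        steered = steered + s * sae_decoder[fid]
    return steered

# Example with random tensors standing in for a real model and SAE.
d_model, num_features = 16, 64
hidden = torch.randn(d_model)
decoder = torch.randn(num_features, d_model)
print(steer_with_sae_features(hidden, decoder, [3, 17], [2.0, -1.5]).shape)
```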
Anthropic (@anthropicai)

Our interpretability team recently released research that traced the thoughts of a large language model. Now we're open-sourcing the method. Researchers can generate "attribution graphs" like those in our study, and explore them interactively.