Ehsan Kamalloo (@ehsk0)'s Twitter Profile
Ehsan Kamalloo

@ehsk0

Research Scientist @ServiceNowRSRCH

ID: 1663072824

http://ehsk.github.io

11-08-2013 17:51:55

177 Tweets

326 Followers

592 Following

Spandana Gella (@gspandana)

Internship at ServiceNow Research to build the next generation of computer use agents that are safe and secure from malicious attacks. Focus on intervention strategies and defenses to make agents robust against unsafe behavior. Apply here: bit.ly/3V3mmTg

Alexandre L.-Piché (@alexpiche_)

Glad to see OpenAI prioritizing abstention responses in their paper! That's a great intro to our TMLR paper in which we developed an iterative self-reflection method for LLM to know when to abstain without ground truth and no additional cost at test time. openreview.net/pdf?id=SvKPfch…

Sai Rajeswar (@rajeswarsai)

💡So far, I have been sharing our multimodal AI research at ServiceNow focused on reasoning over pixels. Today, we share a new chapter with an open-source release of our big initiative in the voice and speech domain.🚀 🎧 AU-Harness: Holistic Evaluation of Audio LLM Responses

Nouha Dziri (@nouhadziri)

[New work🥁] Can RL actually teach NEW solutions, or is it just polishing what already the model learnt in pre-training/mid-training/post-training? 🤔 🧵👇 Can models truly be creative with incredibly challenging problems e.g., math, code, etc This has been the big question

ServiceNow Research (@servicenowrsrch)

SLAM Labs presents Apriel-1.5-15B-Thinker 🚀 An open-weights multimodal reasoning model that hits frontier-level performance with just a fraction of the compute.

ServiceNow Research (@servicenowrsrch)

Don’t miss the Expo Talk by Alex Piché & Dzmitry Bahdanau at #COLM2025! 📢 Fast On-Policy RL for Long Sequence Generation 📍Oct 9 | 1–2:30 PM | Room 523A-B They’ll present PipelineRL: ⚡2x faster learning on long-form reasoning (128 H100s) ⚡Fresh, on-policy data

Nouha Dziri (@nouhadziri)

🚀Ever wondered how to make RL work on impossible hard tasks where pass@k = 0%? 🤔 In our new work, we share the RL Grokking Recipe: a training recipe that enables LLMs to solve previously unsolvable coding problems! I will be at #CoLM2025 next week so happy to chat about it!

ServiceNow Research (@servicenowrsrch)

🚀 New Research Blog Live! Our latest post is out: Unifying autoregressive & diffusion language models by Nima Fathi, Torsten Scholak, and Pierre-André Noël. 𝗔𝘂𝘁𝗼𝗿𝗲𝗴𝗿𝗲𝘀𝘀𝗶𝘃𝗲 and 𝗱𝗶𝗳𝗳𝘂𝘀𝗶𝗼𝗻 𝗺𝗼𝗱𝗲𝗹𝘀 have each driven major advances in generative AI — but

vLLM (@vllm_project)

🚀 The RL community keeps pushing boundaries — from better on-policy data and partial rollouts to in-flight weight updates that mix KV caches across models during inference. Continuing inference while weights change and KV states stay stale sounds wild — but that’s exactly what

Alexandre L.-Piché (@alexpiche_)

Very excited to see vLLM supports Pipeline RL’s in-flight weight updates! It allowed our team to quickly and reliably train Qwen base 7B to reason from scratch! Want to hear more? Join us at our Pipeline RL expo talk at CoLM this Thursday 1PM room 524C.

Alexandre Lacoste (@alex_lacoste_)

🚨 Call for Interns – ServiceNow AI Research (Montreal) Our Computer-Use Agents team (Frontier AI Research) is recruiting interns for 2026! We work on LLMs and VLMs that can reliably use software and publishing at top venues (NeurIPS, ICML, ICLR) and developing open-source

ServiceNow Research (@servicenowrsrch)

🎉 It’s CoLM week! The Conference on Language Modeling (CoLM 2025) kicks off tomorrow in Montréal 🇨🇦🍁 Proud that ServiceNow AI Research is a main sponsor — and that our team will present 5 papers on: 📊 Multimodal reasoning 🔄 Unified AR & diffusion models 🔍 Dense retrieval

Torsten Scholak (@tscholak)

🧠 Call for Interns – ServiceNow AI Research (Montreal) Our Foundation Models Lab is recruiting interns for 2026! We train & optimize LLMs, from diffusion-based generation to state-space hybrids. If you care about efficient LLMs, diffusion or reasoning → this is for you. 🧵👇

🇺🇦 Dzmitry Bahdanau (@dbahdanau)

We did lots of good work since PipelineRL release in May: ⚙️ higher throughput, seq parallel training, multimodal, agentic RL 📜 white paper with great explanations and results: arxiv.org/pdf/2509.19128… We'll present today at CoLM EXPO, room 524C, 1pm!

Alexandre L.-Piché (@alexpiche_)

Very excited to be presenting Pipeline RL this afternoon at CoLM. Join us if you are interested in fast on policy RL training for LLMs 🚀

Issam Laradji (@ilaradji)

🚀 Releasing DRBench, an Enterprise-Grade Deep Research Benchmark Paper! 📄 Paper: lnkd.in/gpRXbb7K 💻 Code: lnkd.in/g4-x5EDc We’re excited to introduce DRBench, the first benchmark designed to evaluate deep research agents on open-ended enterprise research tasks,

Alexandre Drouin (@alexandredrouin)

Excited to speak at the AAAI-26 Workshop on Agentic AI Benchmarks & Enterprise Tasks (Jan 26, Singapore) 🇸🇬 As agents are rapidly productized, realistic enterprise benchmarks for capabilities and reliability are essential! Submit: openreview.net/group?id=AAAI.… 🗓️ Oct 29 cc Graham Neubig

Alexandre L.-Piché (@alexpiche_)

In-flight weight updates have gone from a “weird trick” to a must to train LLMs with RL in the last few weeks. If you want to understand the on-policy and throughput benefits here’s the CoLM talk 🇺🇦 Dzmitry Bahdanau and I gave: youtu.be/Z1uEuRKACRs

ServiceNow Research (@servicenowrsrch)

ServiceNow AI Research presents PipelineRL — one of the most impactful efficiency tricks in modern RL training. An elegant solution to a noisy, expensive problem. Worth the read 👇