Ahmed Awadallah (@ahmedhawadallah)'s Twitter Profile
Ahmed Awadallah

@ahmedhawadallah

Working on AI agents, SLMs and post-training @Microsoft | Partner Research Manager @Microsoft AI Frontiers

ID: 1103888904015364097

Link: https://aka.ms/ahmed · Joined: 08-03-2019 05:23:18

131 Tweets

946 Followers

359 Following

Dimitris Papailiopoulos (@dimitrispapail)'s Twitter Profile Photo

We’ve been cooking... a new open weights 14B Phi-4 reasoning model, SFT’d on ~1.4M carefully curated reasoning demonstrations from o3-mini and RL’d for a tiny bit. This model is a little beast.

Suriya Gunasekar (@suriyagnskr)'s Twitter Profile Photo

I am thrilled to share our newest Phi models. This time we went all in on post-training to produce Phi-4-reasoning (SFT only) and Phi-4-reasoning-plus (SFT + a touch of RL) — both 14B models that pack a punch in a small size across reasoning and general purpose benchmarks🧵

Xeophon (@thexeophon)'s Twitter Profile Photo

I got asked whether I can run the new Phi4 on my personal bench. 

And while I wanted to deprecate my benchmark (for various reasons, I think it is too simple and does not catch nuances like it used to), who am I to refuse this request. 

Super surprised at the numbers, gg MSFT!
Philipp Schmid (@_philschmid)'s Twitter Profile Photo

How can smaller LLMs achieve strong reasoning? By combining data curation with supervised fine-tuning (SFT) and targeted reinforcement learning (RL). Microsoft released their first open reasoning/thinking models with Phi-4-reasoning distilled from OpenAI o3-mini.

Implementation
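The recipe this tweet describes (curate data, then SFT, then a targeted touch of RL) can be sketched as a toy two-stage pipeline. Everything below is illustrative: the one-parameter "model", the quality threshold, and the hill-climbing "RL" stand-in are assumptions for the sketch, not the Phi-4 training code.

```python
import random

random.seed(0)

def curate(demonstrations, min_quality=0.8):
    """Stage 0: keep only carefully vetted reasoning demonstrations."""
    return [d for d in demonstrations if d["quality"] >= min_quality]

def sft_step(weight, demo, lr=0.1):
    """Stage 1 (SFT): pull the toy 'model' toward the demonstrated target."""
    return weight + lr * (demo["target"] - weight)

def rl_step(weight, reward_fn, lr=0.05, noise=0.2):
    """Stage 2 (RL): sample a perturbed output and reinforce it only if a
    verifiable reward improves -- a crude stand-in for reward-based tuning."""
    candidate = weight + random.uniform(-noise, noise)
    if reward_fn(candidate) > reward_fn(weight):
        return weight + lr * (candidate - weight)
    return weight

demos = [{"target": 1.0, "quality": 0.9},   # good demonstration
         {"target": 5.0, "quality": 0.2}]   # noisy one, dropped by curation
reward = lambda w: -abs(w - 1.0)            # reward peaks at the correct answer

w = 0.0
for d in curate(demos):          # curation drops the low-quality demo
    for _ in range(50):
        w = sft_step(w, d)       # SFT gets close to the target...
for _ in range(200):
    w = rl_step(w, reward)       # ...and a small amount of RL refines it
print(round(w, 2))
```

The ordering is the point: RL here only polishes a model that SFT has already placed near good behavior, mirroring the "SFT'd heavily, RL'd for a tiny bit" framing in the tweets above.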
Microsoft Research (@msftresearch)'s Twitter Profile Photo

In this issue: New research on compound AI systems and causal verification of the Confidential Consortium Framework; release of Phi-4-reasoning; enriching tabular data with semantic structure, and more: msft.it/6012SVNCj

Nathan (@nathanhabib1011)'s Twitter Profile Photo

THINKING MODELS TOURNAMENT ARC 🧠📊

I ran open-source evals on some of the latest SOTA reasoning models
Phi4 is the biggest surprise with insane results for only 14B!
Claude isn’t the best reasoner... but crushes GPQA and simpleQA with sheer knowledge.

Full results here 👇
1/N
Steven Bathiche (@sbathiche)'s Twitter Profile Photo

Phi-4 reasoning models are now available to download and run on the NPU on your Snapdragon-powered Copilot+ PC. azure.microsoft.com/en-us/blog/one…

Ahmed Awadallah (@ahmedhawadallah)'s Twitter Profile Photo

Two colleagues recently used our 14-billion-parameter Phi-4-reasoning model to ace graduate-level Linear Algebra and Calculus BC tests—scoring 100% and 69/70 respectively. Thanks to the amazing work of our Windows + Devices colleagues, this model now runs on-device on

Ahmed Awadallah (@ahmedhawadallah)'s Twitter Profile Photo

A few months back, our team released Magentic-One -- showing how we can build multi-agent systems with AutoGen for complex web task completion. But how should humans interact with such systems? Magentic-UI shows how to build an agentic user experience, prioritizing

Mojan Javaheripi (@mojan_jp)'s Twitter Profile Photo

Great to see the additive dataset methodology we proposed in Phi-4-reasoning adopted in open-r1. Tldr: optimize data mixture per reasoning domain, and combine in final run for generalized performance. This is a game changer for reducing data ablation costs.
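The additive methodology in this tweet — tune the data mixture for each reasoning domain separately, then sum the winning mixtures for the final run — can be sketched roughly as follows. The domain names, grid, and toy eval score are all hypothetical; this is not the Phi-4-reasoning or open-r1 code.

```python
from itertools import product

# Hypothetical data sources; a mixture maps source -> sampling weight.
SOURCES = ["math", "code", "science"]

def toy_eval(domain, mixture):
    """Stand-in for a real benchmark run: rewards weight on the matching
    source, with a small penalty for total data volume."""
    return mixture.get(domain, 0.0) - 0.1 * sum(mixture.values())

def tune_domain(domain, grid=(0.0, 0.5, 1.0)):
    """Ablate mixtures for ONE domain in isolation (cheap: no joint runs)."""
    best, best_score = None, float("-inf")
    for weights in product(grid, repeat=len(SOURCES)):
        mixture = dict(zip(SOURCES, weights))
        score = toy_eval(domain, mixture)
        if score > best_score:
            best, best_score = mixture, score
    return best

# Additive combine: sum the per-domain winners into the final mixture.
final_mixture = {s: 0.0 for s in SOURCES}
for domain in SOURCES:
    for source, w in tune_domain(domain).items():
        final_mixture[source] += w

print(final_mixture)
```

The cost saving claimed in the tweet falls out of the structure: ablations are run per domain independently (D separate cheap sweeps) rather than jointly over all domains at once, and only the final combined run needs to be trained at full scale.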

Ahmed Awadallah (@ahmedhawadallah)'s Twitter Profile Photo

Our team is releasing full evaluation logs (model generation, answer extractions, etc.) for 10 models in the Eureka Reasoning Models Study and also for Phi-4-reasoning and Phi-4-reasoning-plus (including reasoning traces) Hope this helps with research on transparency and

AutoGen (@pyautogen)'s Twitter Profile Photo

🚀 Introducing MCP Agents in Magentic-UI! Spin up custom agents that wrap one (or many) MCP tools, and let the Orchestrator pick the best agent for every step of the plan. Check out the demo below to see them in action 👇 #MCP #MagenticUI #AIagents

AutoGen (@pyautogen)'s Twitter Profile Photo

🚀 AutoGen v0.6.4 is out! Shout-out to GitHub Copilot for helping author these new features! 🧠 GraphFlow now retains execution state after termination, just like other group chats. Resets only when the graph fully completes. ⚙️ New parameter_override in Workbenches for