Akshara Prabhakar (@aksh_555) 's Twitter Profile
Akshara Prabhakar

@aksh_555

applied scientist @SFResearch | prev @princeton_nlp, @surathkal_nitk

ID: 1188155157747388418

Link: https://aksh555.github.io | Joined: 26-10-2019 18:07:31

65 Tweets

394 Followers

700 Following

Akshara Prabhakar (@aksh_555) 's Twitter Profile Photo

Have a task that can be decomposed into two tasks requiring different skills? BUT
- it is difficult to generate expert-curated training data?
- do not want to use RAG?

🚀 Introducing LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks

🔗 arxiv.org/abs/2410.13025

1/n
Zuxin Liu (@liuzuxin) 's Twitter Profile Photo

Super interesting work & definitely check it out if you are attending NeurIPS! It reminds me of the paper we published at ICLR this year — TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models — addressing the lifelong learning problem for robotic agents.

Alessandro Sordoni (@murefil) 's Twitter Profile Photo

Super enjoyable read: promising results that model mixing via a small, learnable router on top of independently trained "skills" (parametrized as PEFT experts) can actually generalize better than data mixing (e.g. multi-task learning)
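The description above sketches the core idea: keep independently trained LoRA "skill" adapters frozen and learn only a small router that mixes them, rather than retraining on mixed data. A minimal numpy sketch of that weighted merge is below; all names, dimensions, and the fixed router logits are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 4  # hidden size and LoRA rank (illustrative)

# Frozen base weight of one linear layer
W = rng.standard_normal((d, d))

# Two independently trained "skill" adapters; each contributes a
# low-rank update delta_i = B_i @ A_i (initialized small here)
A = [rng.standard_normal((r, d)) * 0.01 for _ in range(2)]
B = [rng.standard_normal((d, r)) * 0.01 for _ in range(2)]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# The "small, learnable router" reduces here to per-adapter logits;
# in practice these would be the only parameters trained on the
# composed task, with the adapters kept frozen.
logits = np.array([0.3, -0.1])
alpha = softmax(logits)

# Merged weight: W' = W + sum_i alpha_i * (B_i @ A_i)
W_merged = W + sum(a * (Bi @ Ai) for a, Bi, Ai in zip(alpha, B, A))

x = rng.standard_normal(d)
y = W_merged @ x  # forward pass through the merged layer
```

Because each adapter update is low-rank, the merged layer costs no more at inference than the base layer once `W_merged` is materialized.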

Weiran Yao (@iscreamnearby) 's Twitter Profile Photo

It actually reminds me of the multi-source domain adaptation work, where knowing the domain index during training makes the style concepts well represented (theoretically disentangled component-wise, using many source domains with sufficient variations). Then one can just

Tao Yu (@taoyds) 's Twitter Profile Photo

Text-to-SQL has been my passion since Yale Spider 1.0! But as LLMs master it, real-world complexity demands more.

🚀After a year of work, Spider 2.0 shows the gap: o1 achieves just 17%!

The path to production deployment is still long but exciting!

more👉spider2-sql.github.io
Salesforce AI Research (@sfresearch) 's Twitter Profile Photo

🇨🇦🇨🇦🇨🇦 Welcome to Vancouver! 🇨🇦🇨🇦🇨🇦
13 Paper links below! 👇

The @Salesforce AI Research team brought a baker's dozen AI Research advancements to #NeurIPS2024 this year -- from revolutionizing multimodal agents and time series forecasting to tackling responsible AI evaluation
Zuxin Liu (@liuzuxin) 's Twitter Profile Photo

🚀 Introducing our #NeurIPS'24 (D&B track) paper, APIGen - an Automated PIpeline for Generating high-quality agentic data. While I can't attend due to visa issues, my brilliant colleagues Jianguo Zhang @TeeH912 Haolin Chen Akshara Prabhakar will be there. Swing by our booth or the

Salesforce AI Research (@sfresearch) 's Twitter Profile Photo

🤖 Fresh from #NeurIPS2024: Our AI research scientist Akshara Prabhakar discusses our demo of xLAM's specialized agents (customer, search, cleanup) collaborating in Slack! 🧠Refresher course: xLAM is #Salesforce’s family of Large Action models custom-built for function

Andrew Lampinen (@andrewlampinen) 's Twitter Profile Photo

New preprint! In “Naturalistic Computational Cognitive Science: Towards generalizable models and theories that capture the full range of natural behavior” we synthesize AI and cognitive science works into a perspective on pursuing generalizable understanding of cognition. Thread:

Akshara Prabhakar (@aksh_555) 's Twitter Profile Photo

🚀 Just dropped APIGen-MT-5k — 5K high-quality multi-turn agent interactions, generated with our APIGen-MT framework! Built for training & evaluating AI agents.

Silvio Savarese (@silviocinguetta) 's Twitter Profile Photo

Enterprise General Intelligence (EGI) won't require bigger models—it will demand better data! Our recent research demonstrates that smaller models (like our own xLAM-2) trained on high-quality multi-turn interaction data outperform frontier models like GPT-4o and Claude 3.5 in

Salesforce (@salesforce) 's Twitter Profile Photo

Salesforce AI Research’s new series “AI Research Lab - Explained” just dropped! First up? See how we fine-tune specialized models to predict actions, not just language—enabling faster, more precise execution of real-world tasks. ⏯️ Watch and subscribe on YouTube: youtube.com/watch?v=vlvv4Z…

Salesforce AI Research (@sfresearch) 's Twitter Profile Photo

🚨 Introducing CRMArena-Pro: The first multi-turn, enterprise-grade benchmark for LLM agents

✍️Blog: sforce.co/4dKBRIq
🖇️Paper: bit.ly/3T0AY4E
🤗Dataset: bit.ly/4kiRlG3
🖥️Code: bit.ly/4fkrZVM

Most AI benchmarks test isolated, single-turn tasks.
Quantumbytz (@quantumbytz) 's Twitter Profile Photo

Salesforce AI Introduces CRMArena-Pro: The First Multi-Turn and Enterprise-Grade Benchmark for LLM Agents #AI #MachineLearning #IoT #LLM marktechpost.com/2025/06/05/sal…...

Akshara Prabhakar (@aksh_555) 's Twitter Profile Photo

🤖 𝘋𝘰𝘪𝘯𝘨 𝘵𝘩𝘦 𝘵𝘢𝘴𝘬 ≠ 𝘶𝘯𝘥𝘦𝘳𝘴𝘵𝘢𝘯𝘥𝘪𝘯𝘨 𝘵𝘩𝘦 𝘶𝘴𝘦𝘳 ✅ Agents complete tasks ❌ But rarely listen, adapt or align Check out our intern Cheng Qian's 𝐔𝐬𝐞𝐫𝐁𝐞𝐧𝐜𝐡, a benchmark showing even top LLMs align with <𝟑𝟎% of user preferences! ⬇️

Cheng Qian (@qiancheng1231) 's Twitter Profile Photo

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmarking scores.

📄 Paper: arxiv.org/pdf/2509.19736
💻 Code: github.com/SalesforceAIRe…
Zuxin Liu (@liuzuxin) 's Twitter Profile Photo

Agents shouldn’t guess what you want—they should ask when necessary. With UserRL, we built user-centric RL envs so agents clarify first, act second. After RL, even 4B models reliably infer preferences. We trained agents to understand you. Code/envs/data are open—Check it out! 🚀

Caiming Xiong (@caimingxiong) 's Twitter Profile Photo

The human heart is the hardest thing in the world to measure. How do you teach your agent to understand it? Check out our latest paper: UserRL: Training interactive user-centric agent via reinforcement learning. paper: arxiv.org/pdf/2509.19736 code: github.com/SalesforceAIRe…