Vishruth Veerendranath (@viishruth) 's Twitter Profile
Vishruth Veerendranath

@viishruth

ML/NLP Master’s @LTIatCMU

ID: 3274985820

Link: https://vishruth-v.github.io/ · Joined: 11-07-2015 01:33:20

125 Tweets

223 Followers

803 Following

John Yang (@jyangballin) 's Twitter Profile Photo

We're launching SWE-bench Multimodal to eval agents' ability to solve visual GitHub issues. - 617 *brand new* tasks from 17 JavaScript repos - Each task has an image! Existing agents struggle here! We present SWE-agent Multimodal to remedy some issues Led w/ carlos 🧵

Priyanshu Kumar (@kpriyanshu256) 's Twitter Profile Photo

Thrilled to share BrowserART, a test suite tailored for red-teaming browser agents! tl;dr: Aligned LLMs are *not* aligned browser agents. Website: scale.com/research/brows…

Vishruth Veerendranath (@viishruth) 's Twitter Profile Photo

I’m attending #EMNLP2024 in Miami from 11-16th Nov to present ECCO on Tuesday 🏖️ Looking forward to meeting folks and chatting more about code generation and LLM agents!

Language Technologies Institute | @CarnegieMellon (@ltiatcmu) 's Twitter Profile Photo

As we prepare for EMNLP 2024, we're thrilled to congratulate all of the LTI researchers with accepted papers at this year's conference. In all, 32 papers with LTI authors were accepted! Read about them all here: lti.cs.cmu.edu/news-and-event…

Zora Wang (@zhiruow) 's Twitter Profile Photo

Sad to miss #EMNLP2024 but do check out our paper "ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?" presented by Vishruth Veerendranath and Siddhant, Tuesday 11:00-12:30 at Poster Session 02‼️

Language Technologies Institute | @CarnegieMellon (@ltiatcmu) 's Twitter Profile Photo

ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? by Siddhant Waghjale, Vishruth Veerendranath, Zora Wang, and Daniel Fried Session: NLP Applications 1, Session 02, 11:00-12:30 arxiv.org/abs/2407.14044
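The question ECCO poses, efficiency without sacrificing correctness, can be illustrated with a small sketch: check a candidate solution's functional correctness first, and only then compare wall-clock efficiency. This is a hypothetical harness for illustration, not the paper's actual benchmark code.

```python
import time

def check_and_time(fn, tests, repeats=5):
    """Correctness first, then efficiency (best time over several repeats)."""
    # A fast but wrong program should not get credit for speed.
    for args, expected in tests:
        if fn(*args) != expected:
            return False, float("inf")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for args, _ in tests:
            fn(*args)
        best = min(best, time.perf_counter() - start)
    return True, best

# Two functionally equivalent ways to sum 1..n:
naive = lambda n: sum(range(1, n + 1))
closed_form = lambda n: n * (n + 1) // 2

tests = [((10,), 55), ((100,), 5050)]
ok1, t1 = check_and_time(naive, tests)
ok2, t2 = check_and_time(closed_form, tests)
assert ok1 and ok2  # both correct; closed_form is typically faster
```

Both candidates pass the same tests, so the efficiency comparison between them is meaningful; an incorrect candidate is rejected before timing.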

Matthew Leavitt (@leavittron) 's Twitter Profile Photo

🧵We’ve spent the last few months at DatologyAI building a state-of-the-art data curation pipeline and I’m SO excited to share our first results: we curated image-text pretraining data and massively improved CLIP model quality, training speed, and inference efficiency 🔥🔥🔥

Matthew Leavitt (@leavittron) 's Twitter Profile Photo

Tired: Bringing up politics at Thanksgiving Wired: Bringing up DatologyAI’s new text curation results at Thanksgiving That’s right, we applied our data curation pipeline to text pretraining data and the results are hot enough to roast a 🦃 🧵

Daniel Campos (@spacemanidol) 's Twitter Profile Photo

🚀 I am thrilled to introduce SnowflakeDB's Arctic Embed 2.0 embedding models! 2.0 offers high-quality multilingual performance with all the greatness of our prior embedding models (MRL, Apache-2 license, great English retrieval, inference efficiency) snowflake.com/engineering-bl…🌍

Aurick Qiao (@aurickq) 's Twitter Profile Photo

We are excited to share SwiftKV, our recent work at SnowflakeDB AI Research! SwiftKV reduces the pre-fill compute for enterprise LLM inference by up to 2x, resulting in higher serving throughput for input-heavy workloads. 🧵

Zora Wang (@zhiruow) 's Twitter Profile Photo

Excited to co-organize the DL4C workshop at ICLR'25. Check out our call for papers and submit your interesting codegen paper! 😉

Arjun Choudhry (@arjun_7m) 's Twitter Profile Photo

Excited to share TimeSeriesExam for systematic evaluation of time series reasoning capabilities of LLMs. Think your LLM can reason on time series concepts? Take it for a spin on the TimeSeriesExam! Now publicly available on HuggingFace :)

Graham Neubig (@gneubig) 's Twitter Profile Photo

How far are we from having competent AI co-workers that can perform tasks as varied as software development, project management, administration, and data science? In our new paper, we introduce TheAgentCompany, a benchmark for AI agents on consequential real-world tasks.

Snowflake (@snowflakedb) 's Twitter Profile Photo

Snowflake Cortex Agents, now in public preview! Cortex Agents orchestrates across structured and unstructured data for accurate AI-driven decisions from within the secure Snowflake perimeter; Cortex Agents use Cortex Analyst (now in GA) and Cortex Search as tools.

Rajhans Samdani (@rajhans_samdani) 's Twitter Profile Photo

🧵 1/ Recent hype suggests long-context LLMs remove the need for retrieval in RAG pipelines—"just put all your docs in context." We tested this theory rigorously in finance, focusing on SEC filings. Spoiler: Retrieval & chunking strategies still dominate.

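The retrieval-and-chunking baseline the thread argues for can be sketched minimally. This is an illustrative fixed-size chunker and a word-overlap scorer standing in for a real retriever, not the pipeline actually tested on SEC filings.

```python
def chunk(text, size=200, overlap=50):
    """Fixed-size character chunks with overlap -- one common baseline strategy."""
    chunks, step = [], size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks

def retrieve(query, chunks, k=2):
    """Rank chunks by word overlap with the query (a toy stand-in for a
    real dense or lexical retriever such as BM25)."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))
    return scored[:k]

# Usage on a toy document:
doc = "The filing discusses revenue growth. It also lists risk factors in detail."
top = retrieve("risk factors", chunk(doc, size=40, overlap=10), k=1)
```

The overlap between consecutive chunks keeps sentences that straddle a chunk boundary recoverable, which is one reason chunking strategy matters even when the model's context window is large.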
Rajhans Samdani (@rajhans_samdani) 's Twitter Profile Photo

Another banger from my group that tbh raises more questions about creating data agents than answers. Here's the core issue: When creating an agent to query structured & unstructured data for business insights, how do you describe these data tools? Let me elaborate 🧵👇

Zora Wang (@zhiruow) 's Twitter Profile Photo

Meet ASI: Agent Skill Induction A framework for online programmatic skill learning — no offline data, no training. 🧠 Build reusable skills during test 📈 +23.5% success, +15.3% efficiency 🌐 Scales to long-horizon tasks, transfers across websites Let's dive in! 🧵

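The core loop described above, inducing reusable programmatic skills online from successful action traces, can be sketched as a toy. The class, method names, and "website" state below are illustrative assumptions, not ASI's actual interface.

```python
class SkillLibrary:
    """Toy online skill induction: after a task succeeds, store its action
    trace as a named, reusable skill -- no offline data, no training."""

    def __init__(self):
        self.skills = {}

    def induce(self, name, actions):
        # Wrap a successful action trace as a callable, reusable skill.
        def skill(env):
            for act in actions:
                act(env)
        self.skills[name] = skill

    def run(self, name, env):
        self.skills[name](env)

# Usage: primitive actions on a toy "website" state.
lib = SkillLibrary()
env = {"cart": []}
add_item = lambda item: (lambda e: e["cart"].append(item))

# After a successful episode, the trace becomes a higher-level skill
# the agent can invoke on later tasks (or transfer to similar sites).
lib.induce("add_two_items", [add_item("a"), add_item("b")])
lib.run("add_two_items", env)
```

Because induced skills are ordinary programs, they compose: a long-horizon task can call several previously learned skills instead of replanning each primitive step.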
Zora Wang (@zhiruow) 's Twitter Profile Photo

Cannot attend #ICLR2025 in person (will be at NAACL and Stanford soon!), but do check out 👇
▪️Apr 27: "Exploring the Pre-conditions for Memory-Learning Agents" led by Vishruth Veerendranath and Vishwa Shah, at the SSI-FM workshop
▪️Apr 28: our Deep Learning For Code @ ICLR'25 workshop with a fantastic lineup of works

Zora Wang (@zhiruow) 's Twitter Profile Photo

Excited to share that AWM has been accepted at #ICML2025 🥳 Check out our online memory-adaptive agent if you haven't! 🔗arxiv.org/abs/2409.07429