Rogerio Feris (@rogerioferis)'s Twitter Profile
Rogerio Feris

@rogerioferis

Principal scientist and manager at the MIT-IBM Watson AI Lab

ID: 1226554235770298369

Link: http://rogerioferis.org
Joined: 09-02-2020 17:11:45

47 Tweets

1.1K Followers

353 Following

James Smith (@jamessealesmith)'s Twitter Profile Photo

Happy to share that we had two papers accepted to #CVPR2023! Both are on continual adaptation of pre-trained models (ViT for image classification and BLIP for NLVR).

More details (and code) will be coming soon! 

arxiv.org/abs/2211.13218
arxiv.org/abs/2211.09790
John Nay (@johnjnay)'s Twitter Profile Photo

Multi-Task Prompt Tuning Enables Transfer Learning

-Learn single prompt from multiple task-specific prompts
-Learn multiplicative low rank updates to adapt it to tasks

Parameter-efficient & state-of-the-art performance across diverse NLP tasks

Paper: arxiv.org/abs/2303.02861
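The decomposition described above — one shared prompt plus a per-task multiplicative low-rank update — can be sketched in a few lines of NumPy. This is only an illustrative sketch of the idea as stated in the tweet; all shapes, names, and the rank-1 choice are assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
prompt_len, dim = 10, 64  # illustrative soft-prompt shape

# Shared soft prompt, distilled from multiple source-task prompts.
shared_prompt = rng.normal(size=(prompt_len, dim))

def task_prompt(shared, u, v):
    """Task-specific prompt: shared prompt modulated by a rank-1
    multiplicative (Hadamard) update u v^T."""
    return shared * (u @ v.T)

# Per-task parameters: only prompt_len + dim values per task,
# instead of a full prompt_len * dim prompt matrix.
u_t = rng.normal(size=(prompt_len, 1))
v_t = rng.normal(size=(dim, 1))
p_t = task_prompt(shared_prompt, u_t, v_t)
```

The parameter efficiency comes from the last point in the comments: each new task stores two small vectors rather than a full prompt matrix.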
Rogerio Feris (@rogerioferis)'s Twitter Profile Photo

We are looking for a summer intern (MSc/PhD) to work on large language models for sports & entertainment, with the goal of improving the experience of millions of fans at major tournaments (US Open/Wimbledon). @IBMSports MIT-IBM Watson AI Lab. Apply at: krb-sjobs.brassring.com/TGnewUI/Search…

MIT-IBM Watson AI Lab (@mitibmlab)'s Twitter Profile Photo

New technique from the MIT-IBM Watson AI Lab and its collaborators learns to "grow" a larger machine-learning model from a smaller, pre-trained model, reducing the monetary and environmental cost of developing AI applications and with similar or improved performance. news.mit.edu/2023/new-techn…

news.mit.edu/2023/new-techn…
Dario Gil (@dariogila)'s Twitter Profile Photo

We can all agree we’re at a unique and evolutionary moment in AI, with enterprises increasingly turning to this technology’s transformative power to unlock new levels of innovation and productivity. At #Think2023, IBM unveiled watsonx. Learn more: newsroom.ibm.com/2023-05-09-IBM…

Dmitry Krotov (@dimakrotov)'s Twitter Profile Photo

Recent advances in Hopfield networks of associative memory may be the guiding theoretical principle for designing novel large scale neural architectures. I explain my enthusiasm about these ideas in the article ⬇️⬇️⬇️. Please let me know what you think. nature.com/articles/s4225…
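For readers unfamiliar with these models, the retrieval rule of a dense (modern) Hopfield network can be sketched in a few lines: the query is repeatedly pulled toward a softmax-weighted combination of stored patterns until it settles on the one it most resembles. This is a generic sketch of the standard formulation, not code from the article; all sizes and names are illustrative:

```python
import numpy as np

def hopfield_retrieve(memories, query, beta=8.0, steps=3):
    """Dense associative memory update: xi <- M^T softmax(beta * M xi).
    With large beta, the query snaps to its nearest stored pattern."""
    xi = query.copy()
    for _ in range(steps):
        scores = beta * (memories @ xi)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        xi = memories.T @ weights
    return xi

rng = np.random.default_rng(1)
mem = rng.normal(size=(5, 16))               # 5 stored patterns
noisy = mem[2] + 0.1 * rng.normal(size=16)   # corrupted copy of pattern 2
out = hopfield_retrieve(mem, noisy)          # recovers something close to mem[2]
```

The softmax-over-similarities step is exactly what connects these networks to Transformer attention, which is the link the article explores.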

Zexue He (@zexuehe)'s Twitter Profile Photo

(1/3)🤔Wondering what's transferred between pre-training and fine-tuning? Our ACL Findings paper looks into this question with synthetic pre-training tasks for MT. Surprisingly, most pre-training benefits are realized even with a 75% nonsense parallel corpus or purely synthetic data!

Dmitry Krotov (@dimakrotov)'s Twitter Profile Photo

What could be the computational function of astrocytes in the brain? We hypothesize that they may be the biological cells that could implement the Transformer's attention operation commonly used in AI. Much improved compared to an earlier preprint: pnas.org/doi/10.1073/pn…
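The attention operation referenced here is the Transformer's standard scaled dot-product attention. A minimal NumPy sketch of that operation (illustrative shapes only; this makes no claim about the biological neuron-astrocyte mapping proposed in the paper):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)  # each row sums to 1
    return w @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
out = attention(Q, K, V)      # shape (4, 8)
```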

Junmo Kang (@junmokang)'s Twitter Profile Photo

🚨Can we self-align LLMs with an expert domain like biomedicine with limited supervision?

Introducing Self-Specialization, uncovering expertise latent within LLMs to boost their utility in specialized domains.

arxiv.org/abs/2310.00160

Georgia Tech School of Interactive Computing Machine Learning at Georgia Tech MIT CSAIL MIT-IBM Watson AI Lab

1/8
Yann LeCun (@ylecun)'s Twitter Profile Photo

IBM & Meta are launching the AI Alliance to advance *open* & reliable AI. The list of over 50 founding members from industry, government, and academia includes AMD, Anyscale, CERN, Hugging Face, the Linux Foundation, NASA... ai.meta.com/blog/ai-allian…

Rogerio Feris (@rogerioferis)'s Twitter Profile Photo

We have a cool challenge on understanding document images in our 2nd #CVPR2024 workshop, “What is Next in Multimodal Foundation Models?” (sites.google.com/view/2nd-mmfm-…). This is a great opportunity to showcase your work in front of a large audience (pic below from our 1st workshop).

Leonid Karlinsky (@leokarlin)'s Twitter Profile Photo

Thanks for the highlight AK! We offer a simple and nearly data-free way to move large quantities of custom PEFT models within or across LLM families, or even across PEFT configurations. Useful for LLM cloud hosting when old base models need to be deprecated & upgraded.

Wei Lin @ ECCV 2024 (@weilincv)'s Twitter Profile Photo

Come join our workshop to figure out what is next in multimodal foundation models! Tuesday 08:30 Pacific Time, Summit 437-439 at Seattle Convention Center Summit 🤖

Nasim Borazjanizadeh (@nasimborazjani)'s Twitter Profile Photo

🚨 OpenAI's new o1 model scores only 38.2% in correctness on our new benchmark of combinatorial problems, SearchBench (arxiv.org/abs/2406.12172), while 57.1% is possible with GPT-4 and A* MSMT prompting! 🚨
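For context, A* is the classical best-first search that the prompting strategy above pairs with GPT-4. A minimal, self-contained sketch on a toy grid (the grid, heuristic, and function names here are illustrative and not taken from the benchmark):

```python
import heapq

def astar(start, goal, neighbors, h):
    """Minimal A*: neighbors(n) yields (next_node, step_cost);
    h is an admissible heuristic. Returns a shortest path or None."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path  # first pop of goal is optimal (consistent h)
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

# Toy instance: 4-connected 5x5 grid, Manhattan-distance heuristic.
def grid_neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 5 and 0 <= ny < 5:
            yield (nx, ny), 1

path = astar((0, 0), (4, 4), grid_neighbors,
             lambda p: abs(4 - p[0]) + abs(4 - p[1]))
```

The benchmark's point is that such systematic search is exactly what the LLM must emulate (or delegate to) when solving combinatorial problems.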

Yikang Shen (@yikang_shen)'s Twitter Profile Photo

Granite 3.0 is our latest update for the IBM foundation models. The 8B and 2B models outperform strong competitors with similar sizes. The 1B and 3B MoE use only 400M and 800M active parameters to target the on-device use cases. Our technical report provides all the details you

Chancharik Mitra (@chancharikm)'s Twitter Profile Photo

🎯 Introducing Sparse Attention Vectors (SAVs): A breakthrough method for extracting powerful multimodal features from Large Multimodal Models (LMMs). SAVs enable SOTA performance on discriminative vision-language tasks (classification, safety alignment, etc.)! Links in replies!

Yu Wang (@__yuwang__)'s Twitter Profile Photo

🎉 Our paper “M+: Extending MemoryLLM with Scalable Long-Term Memory” is accepted to ICML 2025! 🔹 Co-trained retriever + latent memory 🔹 Retains info across 160k+ tokens 🔹 Much Lower GPU cost compared to backbone LLM arxiv.org/abs/2502.00592

Memory and Vision Workshop (@memvis_iccv25)'s Twitter Profile Photo

MemVis @ #ICCV2025 -- 1st Workshop on Memory & Vision! 🧠👁️ Call for papers now open: Hopfield & energy nets, state-space + diffusion models, retrieval & lifelong learning, long-context FMs, multimodal memory, & more. 🗓️ Submit by 1 Aug 2025 → sites.google.com/view/memvis-ic… 🌺 #MemVis