Rogerio Feris (@rogerioferis)'s Twitter Profile
Rogerio Feris

@rogerioferis

Principal scientist and manager at the MIT-IBM Watson AI Lab

ID: 1226554235770298369

Link: http://rogerioferis.org
Joined: 09-02-2020 17:11:45

47 Tweets

1.1K Followers

353 Following

James Smith (@jamessealesmith)'s Twitter Profile Photo

Happy to share that we had two papers accepted to #CVPR2023! Both are on continual adaptation of pre-trained models (ViT for image classification and BLIP for NLVR).

More details (and code) will be coming soon! 

arxiv.org/abs/2211.13218
arxiv.org/abs/2211.09790
John Nay (@johnjnay)'s Twitter Profile Photo

Multi-Task Prompt Tuning Enables Transfer Learning

-Learn single prompt from multiple task-specific prompts
-Learn multiplicative low rank updates to adapt it to tasks

Parameter-efficient & state-of-the-art performance across diverse NLP tasks

Paper: arxiv.org/abs/2303.02861
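The decomposition described above — one shared prompt plus a per-task multiplicative low-rank update — can be sketched in a few lines of NumPy. This is only an illustrative sketch of the idea as stated in the tweet; all shapes, names, and the rank-1 choice are assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
prompt_len, dim = 10, 64  # illustrative soft-prompt shape

# Shared soft prompt, distilled from multiple source-task prompts.
shared_prompt = rng.normal(size=(prompt_len, dim))

def task_prompt(shared, u, v):
    """Task-specific prompt: shared prompt modulated by a rank-1
    multiplicative (Hadamard) update u v^T."""
    return shared * (u @ v.T)

# Per-task parameters: only prompt_len + dim values per task,
# instead of a full prompt_len * dim prompt matrix.
u_t = rng.normal(size=(prompt_len, 1))
v_t = rng.normal(size=(dim, 1))
p_t = task_prompt(shared_prompt, u_t, v_t)
```

The parameter efficiency comes from the last point in the comments: each new task stores two small vectors rather than a full prompt matrix.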
Rogerio Feris (@rogerioferis)'s Twitter Profile Photo

We are looking for a summer intern (MSc/PhD) to work on large language models for sports & entertainment, with the goal of improving the experience of millions of fans at major tournaments (US Open/Wimbledon). @IBMSports MIT-IBM Watson AI Lab. Apply at: krb-sjobs.brassring.com/TGnewUI/Search…

MIT-IBM Watson AI Lab (@mitibmlab)'s Twitter Profile Photo

New technique from the MIT-IBM Watson AI Lab and its collaborators learns to "grow" a larger machine-learning model from a smaller, pre-trained model, reducing the monetary and environmental cost of developing AI applications and with similar or improved performance. news.mit.edu/2023/new-techn…

news.mit.edu/2023/new-techn…
Dario Gil (@dariogila)'s Twitter Profile Photo

We can all agree we’re at a unique and evolutionary moment in AI, with enterprises increasingly turning to this technology’s transformative power to unlock new levels of innovation and productivity. At #Think2023, IBM unveiled watsonx. Learn more: newsroom.ibm.com/2023-05-09-IBM…

Dmitry Krotov (@dimakrotov)'s Twitter Profile Photo

Recent advances in Hopfield networks of associative memory may be the guiding theoretical principle for designing novel large scale neural architectures. I explain my enthusiasm about these ideas in the article ⬇️⬇️⬇️. Please let me know what you think. nature.com/articles/s4225…
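For readers unfamiliar with these models, the retrieval rule of a dense (modern) Hopfield network can be sketched in a few lines: the query is repeatedly pulled toward a softmax-weighted combination of stored patterns until it settles on the one it most resembles. This is a generic sketch of the standard formulation, not code from the article; all sizes and names are illustrative:

```python
import numpy as np

def hopfield_retrieve(memories, query, beta=8.0, steps=3):
    """Dense associative memory update: xi <- M^T softmax(beta * M xi).
    With large beta, the query snaps to its nearest stored pattern."""
    xi = query.copy()
    for _ in range(steps):
        scores = beta * (memories @ xi)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        xi = memories.T @ weights
    return xi

rng = np.random.default_rng(1)
mem = rng.normal(size=(5, 16))               # 5 stored patterns
noisy = mem[2] + 0.1 * rng.normal(size=16)   # corrupted copy of pattern 2
out = hopfield_retrieve(mem, noisy)          # recovers something close to mem[2]
```

The softmax-over-similarities step is exactly what connects these networks to Transformer attention, which is the link the article explores.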

Zexue He (@zexuehe)'s Twitter Profile Photo

(1/3)🤔Wondering what's transferred between pre-training and fine-tuning? Our ACL Findings paper looks into this question with synthetic pre-training tasks for MT. Surprisingly, most pre-training benefits are realized even with a 75% nonsense parallel corpus or purely synthetic data!

Dmitry Krotov (@dimakrotov)'s Twitter Profile Photo

What could be the computational function of astrocytes in the brain? We hypothesize that they may be the biological cells that could implement the Transformer's attention operation commonly used in AI. Much improved compared to an earlier preprint: pnas.org/doi/10.1073/pn…
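The attention operation referenced here is the Transformer's standard scaled dot-product attention. A minimal NumPy sketch of that operation (illustrative shapes only; this makes no claim about the biological neuron-astrocyte mapping proposed in the paper):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)  # each row sums to 1
    return w @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
out = attention(Q, K, V)      # shape (4, 8)
```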

Junmo Kang (@junmokang)'s Twitter Profile Photo

🚨Can we self-align LLMs with an expert domain like biomedicine with limited supervision?

Introducing Self-Specialization, uncovering expertise latent within LLMs to boost their utility in specialized domains.

arxiv.org/abs/2310.00160

Georgia Tech School of Interactive Computing Machine Learning at Georgia Tech MIT CSAIL MIT-IBM Watson AI Lab

1/8
Yann LeCun (@ylecun)'s Twitter Profile Photo

IBM & Meta are launching the AI Alliance to advance *open* & reliable AI. The list of over 50 founding members from industry, government, and academia includes AMD, Anyscale, CERN, Hugging Face, the Linux Foundation, NASA... ai.meta.com/blog/ai-allian…

Rogerio Feris (@rogerioferis)'s Twitter Profile Photo

We have a cool challenge on understanding document images in our 2nd #CVPR2024 workshop, “What is Next in Multimodal Foundation Models?” (sites.google.com/view/2nd-mmfm-…). This is a great opportunity to showcase your work in front of a large audience (pic below from our 1st workshop).

Leonid Karlinsky (@leokarlin)'s Twitter Profile Photo

Thanks for the highlight AK! We offer a simple and nearly data-free way to move large quantities of custom PEFT models within or across LLM families, or even across PEFT configurations. Useful for LLM cloud hosting when old base models need to be deprecated & upgraded.

Wei Lin @ ECCV 2024 (@weilincv)'s Twitter Profile Photo

Come join our workshop to figure out what is next in multimodal foundation models! Tuesday 08:30 Pacific Time, Summit 437-439 at Seattle Convention Center Summit 🤖

Nasim Borazjanizadeh (@nasimborazjani)'s Twitter Profile Photo

🚨 OpenAI's new o1 model scores only 38.2% in correctness on our new benchmark of combinatorial problems, SearchBench (arxiv.org/abs/2406.12172), while 57.1% is possible with GPT-4 and A* MSMT prompting! 🚨
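For context, A* is the classical best-first search that the prompting strategy above pairs with GPT-4. A minimal, self-contained sketch on a toy grid (the grid, heuristic, and function names here are illustrative and not taken from the benchmark):

```python
import heapq

def astar(start, goal, neighbors, h):
    """Minimal A*: neighbors(n) yields (next_node, step_cost);
    h is an admissible heuristic. Returns a shortest path or None."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path  # first pop of goal is optimal (consistent h)
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt, path + [nxt]))
    return None

# Toy instance: 4-connected 5x5 grid, Manhattan-distance heuristic.
def grid_neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 5 and 0 <= ny < 5:
            yield (nx, ny), 1

path = astar((0, 0), (4, 4), grid_neighbors,
             lambda p: abs(4 - p[0]) + abs(4 - p[1]))
```

The benchmark's point is that such systematic search is exactly what the LLM must emulate (or delegate to) when solving combinatorial problems.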

Yikang Shen (@yikang_shen)'s Twitter Profile Photo

Granite 3.0 is our latest update for the IBM foundation models. The 8B and 2B models outperform strong competitors with similar sizes. The 1B and 3B MoE use only 400M and 800M active parameters to target the on-device use cases. Our technical report provides all the details you

Chancharik Mitra (@chancharikm)'s Twitter Profile Photo

🎯 Introducing Sparse Attention Vectors (SAVs): A breakthrough method for extracting powerful multimodal features from Large Multimodal Models (LMMs). SAVs enable SOTA performance on discriminative vision-language tasks (classification, safety alignment, etc.)! Links in replies!

Yu Wang (@__yuwang__)'s Twitter Profile Photo

🎉 Our paper “M+: Extending MemoryLLM with Scalable Long-Term Memory” is accepted to ICML 2025! 🔹 Co-trained retriever + latent memory 🔹 Retains info across 160k+ tokens 🔹 Much Lower GPU cost compared to backbone LLM arxiv.org/abs/2502.00592

Memory and Vision Workshop (@memvis_iccv25)'s Twitter Profile Photo

MemVis @ #ICCV2025 -- 1st Workshop on Memory & Vision! 🧠👁️ Call for papers now open: Hopfield & energy nets, state-space + diffusion models, retrieval & lifelong learning, long-context FMs, multimodal memory, & more. 🗓️ Submit by 1 Aug 2025 → sites.google.com/view/memvis-ic… 🌺 #MemVis