Ramakanth (@ramakanth1729) 's Twitter Profile
Ramakanth

@ramakanth1729

Research Scientist @MetaAI

ID: 910945363

Website: http://rama-kanth.com · Joined: 28-10-2012 19:19:33

294 Tweets

471 Followers

443 Following

Swarnadeep Saha (@swarnanlp) 's Twitter Profile Photo

New paper 🚨 arxiv.org/abs/2212.08607

Can we build fluent, factual & logical text generation systems w/ multi-step reasoning over semi-structured data (tables/graphs)?

We propose MURMUR, a neuro-symbolic method w/ symbolic modules for logical skills & LLMs for linguistic skills 🧵
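
Purely as an illustration of that division of labor, here is a tiny Python sketch (my assumption, not the paper's code): a deterministic symbolic module performs the logical step over a table, and an LLM, stubbed out here, only verbalizes the verified result.

```python
# Hypothetical sketch of the symbolic/linguistic split; none of these names
# come from the MURMUR paper.

def max_by(rows, key):
    # Symbolic module ("logical skill"): deterministic, so its output is checkable.
    return max(rows, key=lambda r: r[key])

def call_llm(prompt: str) -> str:
    # Stand-in for a real LLM call ("linguistic skill"); stubbed for the demo.
    return "Spain scored the most goals in the tournament, with 13."

table = [
    {"team": "Spain", "goals": 13},
    {"team": "Italy", "goals": 7},
]

best = max_by(table, "goals")  # step 1: the logical reasoning happens in code
sentence = call_llm(           # step 2: the LLM only handles fluent wording
    f"State in one sentence: {best['team']} scored the most goals ({best['goals']})."
)
print(sentence)
```

Because the reasoning step runs as code rather than inside the LLM, the generated sentence can be checked against a fact that is guaranteed correct.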
Asli Celikyilmaz (@real_asli) 's Twitter Profile Photo

Introducing our new work, MURMUR, a neuro-symbolic modular reasoning approach to generate logical, faithful and diverse summaries from structured inputs w/ Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth

Mengzhou Xia (@xiamengzhou) 's Twitter Profile Photo

How do language models of different sizes learn during the course of pre-training? We study the training trajectories using checkpoints of language models from 125M to 175B parameters for a better understanding! Check out our new paper 📜: arxiv.org/abs/2212.09803 (1/N)

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Announcing OPT-IML: a new language model from Meta AI with 175B parameters, fine-tuned on 2,000 language tasks — openly available soon under a noncommercial license for research use cases. Research paper & more details on GitHub ⬇️

Srini Iyer (@sriniiyer88) 's Twitter Profile Photo

Exciting updates on OPT-IML (arxiv.org/abs/2212.12017)! 1. The 175B models are now available to request for research purposes. Request here: (github.com/facebookresear…). 2. OPT-IML 30B and 1.3B models are now available on huggingface (huggingface.co/facebook/opt-i…)!

Xi Ye (@xiye_nlp) 's Twitter Profile Photo

Our paper on ๐ฉ๐ซ๐จ๐ฆ๐ฉ๐ญ๐ข๐ง๐  ๐‹๐‹๐Œ๐ฌ ๐ฐ๐ข๐ญ๐ก ๐ž๐ฑ๐ฉ๐ฅ๐š๐ง๐š๐ญ๐ข๐จ๐ง๐ฌ is accepted at Findings of #ACL2023NLP Check out our poster at the ๐๐‹๐‘๐’๐„ ๐ฐ๐จ๐ซ๐ค๐ฌ๐ก๐จ๐ฉ ๐จ๐ง ๐“๐ก๐ฎ๐ซ๐ฌ๐๐š๐ฒ. I won't be there in person๐Ÿ˜ข but you can say hi to my advisor Greg Durrett ๐Ÿ˜†

Our paper on ๐ฉ๐ซ๐จ๐ฆ๐ฉ๐ญ๐ข๐ง๐  ๐‹๐‹๐Œ๐ฌ ๐ฐ๐ข๐ญ๐ก ๐ž๐ฑ๐ฉ๐ฅ๐š๐ง๐š๐ญ๐ข๐จ๐ง๐ฌ is accepted at Findings of #ACL2023NLP
Check out our poster at the ๐๐‹๐‘๐’๐„ ๐ฐ๐จ๐ซ๐ค๐ฌ๐ก๐จ๐ฉ ๐จ๐ง ๐“๐ก๐ฎ๐ซ๐ฌ๐๐š๐ฒ. I won't be there in person๐Ÿ˜ข but you can say hi to my advisor <a href="/gregd_nlp/">Greg Durrett</a> ๐Ÿ˜†
Armen Aghajanyan (@armenagha) 's Twitter Profile Photo

I'm excited to release our most recent work setting a new SOTA FID of 4.88 on text-to-image generation we call CM3Leon (pronounced chameleon)! ai.meta.com/research/publi…
Jiacheng Liu (@liujc1998) 's Twitter Profile Photo

What if we combine PPO with Monte-Carlo Tree Search – the secret sauce for AlphaGo to reach superhuman performance? Spoiler: MAGIC!! Our inference-time decoding method, PPO-MCTS, achieves impressive results across many text generation tasks. 📜 arxiv.org/abs/2309.15028 🧵(1/n)
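
For intuition about what such a combination looks like, here is a toy value-guided tree search over next tokens. `policy_probs` and `value` are stand-ins for a PPO-trained policy and its value head; the whole thing sketches the general PUCT recipe, not the paper's actual PPO-MCTS implementation.

```python
import math, random

def policy_probs(seq):
    # Stand-in for a PPO policy's next-token distribution (hypothetical).
    return {"a": 0.6, "b": 0.4}

def value(seq):
    # Stand-in for a PPO value head scoring a partial sequence (hypothetical).
    return random.random()

class Node:
    def __init__(self, seq, prior):
        self.seq, self.prior = seq, prior
        self.children, self.visits, self.total = {}, 0, 0.0

def select(node, c_puct=1.0):
    # PUCT: balance mean value (exploitation) against prior-weighted novelty.
    return max(node.children.values(),
               key=lambda n: n.total / (n.visits or 1)
               + c_puct * n.prior * math.sqrt(node.visits) / (1 + n.visits))

def search(root, n_sims=50):
    for _ in range(n_sims):
        node, path = root, [root]
        while node.children:                      # 1. select down the tree
            node = select(node)
            path.append(node)
        for tok, p in policy_probs(node.seq).items():
            node.children[tok] = Node(node.seq + [tok], p)   # 2. expand
        v = value(node.seq)                       # 3. evaluate with value head
        for n in path:                            # 4. back up the estimate
            n.visits += 1
            n.total += v
    return max(root.children, key=lambda t: root.children[t].visits)

print(search(Node([], prior=1.0)))  # emit the most-visited next token
```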

Howard Chen (@__howardchen) 's Twitter Profile Photo

Long context models are popular, but is it the final solution to long text reading?
We introduce a fundamentally different method, MemWalker:
1. Build a data structure (memory tree)
2. Traverse it via LLM prompting
Outperforms long context, retrieval, & recurrent baselines. (1/n)
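
Those two steps translate naturally into code. Below is a minimal sketch under my own assumptions (the function names and the summarizer/navigator stubs are not from the paper): build a tree of summaries bottom-up, then let an LLM pick a branch at each level until it reaches a leaf chunk.

```python
def summarize(texts):
    # Stand-in for an LLM summarization call; stubbed as truncation here.
    return " / ".join(t[:30] for t in texts)

class TreeNode:
    def __init__(self, summary, children=None, text=None):
        self.summary, self.children, self.text = summary, children or [], text

def build_memory_tree(chunks, fanout=2):
    # Step 1: leaves hold raw chunks; each level up summarizes `fanout` nodes.
    nodes = [TreeNode(summarize([c]), text=c) for c in chunks]
    while len(nodes) > 1:
        nodes = [TreeNode(summarize([n.summary for n in grp]), children=list(grp))
                 for grp in (nodes[i:i + fanout] for i in range(0, len(nodes), fanout))]
    return nodes[0]

def traverse(root, question, choose_child):
    # Step 2: an LLM (here, a stub callback) picks the promising branch each hop.
    node = root
    while node.children:
        node = choose_child(question, node.children)
    return node.text  # the leaf chunk to actually read and answer from

root = build_memory_tree([f"chunk {i}: ..." for i in range(8)])
print(traverse(root, "where is topic X?", lambda q, kids: kids[0]))
```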
Jiacheng Liu (@liujc1998) 's Twitter Profile Photo

Introducing 🔮Crystal🔮, an LM that conducts “introspective reasoning” and shows its reasoning process for QA. This improves both QA accuracy and human interpretability => Reasoning made Crystal clear!

arxiv.org/abs/2310.04921
Demo: hf.co/spaces/liujch1…
at #EMNLP2023

🧵(1/n)
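
A hedged sketch of the two-stage idea, as I read the tweet: the model first writes down the knowledge it believes is relevant, then answers conditioned on that introspection, which is what makes the reasoning inspectable. `call_lm` and both prompts are illustrative, not Crystal's.

```python
def call_lm(prompt: str) -> str:
    # Stand-in for a real LM call; stubbed so the sketch runs.
    return "stubbed model output"

def answer_with_introspection(question: str) -> tuple[str, str]:
    # Stage 1: introspective reasoning, made explicit as text.
    knowledge = call_lm(f"Question: {question}\nRelevant knowledge:")
    # Stage 2: the answer is conditioned on the model's own stated knowledge,
    # so a human can audit why it answered the way it did.
    answer = call_lm(f"Question: {question}\nKnowledge: {knowledge}\nAnswer:")
    return knowledge, answer

print(answer_with_introspection("Can crystals conduct electricity?"))
```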
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

The ART of LLM Refinement: Ask, Refine, and Trust

Achieves a performance gain of 5 points over self-refinement baselines, while using a much smaller model as the decision maker

arxiv.org/abs/2311.07961
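
Reading just the title and the summary line, the loop seems to decompose as below. This is my sketch of that decomposition with hypothetical interfaces; the detail worth noticing is that the small model, not the big generator, makes both decisions.

```python
def art(llm, small_model, problem):
    # Ask-Refine-Trust, sketched from the paper title (hypothetical helpers).
    draft = llm(problem)

    # Ask: a much smaller decision-maker judges whether refinement is needed.
    if small_model(f"Does this answer need fixing?\n{problem}\n{draft}") != "yes":
        return draft

    # Refine: the large model revises its own draft.
    revised = llm(f"Improve this answer.\n{problem}\n{draft}")

    # Trust: the small model again decides which version to keep.
    verdict = small_model(f"Which is better?\nA: {draft}\nB: {revised}")
    return revised if verdict == "B" else draft
```

Keeping the accept/reject decisions in a small model is what makes the reported 5-point gain notable: the expensive generator is only invoked for drafting and revising.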
AI at Meta (@aiatmeta) 's Twitter Profile Photo

Newly published work from FAIR, Chameleon: Mixed-Modal Early-Fusion Foundation Models.

This research presents a family of early-fusion token-based mixed-modal models capable of understanding & generating images & text in any arbitrary sequence.

Paper ➡️ go.fb.me/7rb19n
Srini Iyer (@sriniiyer88) 's Twitter Profile Photo

Excited to release our work from last year showcasing a stable training recipe for fully token-based multi-modal early-fusion auto-regressive models! arxiv.org/abs/2405.09818 Huge shout out to Armen Aghajanyan, Ramakanth, Luke Zettlemoyer, Gargi Ghosh, and other co-authors. (1/n)

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Today is a good day for open science. As part of our continued commitment to the growth and development of an open ecosystem, today at Meta FAIR we're announcing four new publicly available AI models and additional research artifacts to inspire innovation in the community and…

Srini Iyer (@sriniiyer88) 's Twitter Profile Photo

Super excited to open-source Chameleon 7B and 34B model weights today. These early-fusion models can understand and generate any sequence of interleaved images and text! Image-gen capabilities are masked for safety reasons, but everything else is enabled. Happy fine-tuning!

Artidoro Pagnoni (@artidoropagnoni) 's Twitter Profile Photo

🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯

Paper 📄 dl.fbaipublicfiles.com/blt/BLT__Patch…
Code 🛠️ github.com/facebookresear…
Srini Iyer (@sriniiyer88) 's Twitter Profile Photo

BLT model weights are out! Responding to popular demand, we just open-sourced model weights for our 1B and 8B BLT models for the research community to play with! huggingface.co/facebook/blt Hoping to see many new and improved BLT based architectures this year!

Gargi Ghosh (@gargighosh) 's Twitter Profile Photo

New research from FAIR: Active Reading, a framework to learn a given set of material with self-generated learning strategies, for generalized and expert domains (such as finance). It absorbs significantly more knowledge than vanilla finetuning and the usual data augmentation strategies.

Ramakanth (@ramakanth1729) 's Twitter Profile Photo

Meet HoneyBee: our new 2.5M-sample multi-modal reasoning dataset. Models trained on it outperform InternVL2.5/3-Instruct and Qwen2.5-VL-Instruct. More details in this post!

Jessy Lin (@realjessylin) 's Twitter Profile Photo

🧠 How can we equip LLMs with memory that allows them to continually learn new things?

In our new paper with AI at Meta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge.

While full…
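
The mechanism, as the tweet describes it, is sparse by construction: a memory layer is a big key-value table, each input only touches a few slots, so only those rows need gradient updates. Here is a toy PyTorch rendering under my own assumptions (the shapes, the top-k addressing rule, and the objective are all illustrative, not the paper's code):

```python
import torch

n_slots, d = 10_000, 64
keys = torch.randn(n_slots, d)                         # frozen addressing keys
values = torch.randn(n_slots, d, requires_grad=True)   # trainable memory slots

def memory_lookup(x, k=4):
    # Each input addresses only k slots: sparsity comes from the architecture.
    top = (x @ keys.T).topk(k, dim=-1)
    w = top.values.softmax(-1).unsqueeze(-1)
    return top.indices, (w * values[top.indices]).sum(-2)

x = torch.randn(1, d)            # representation of one new fact to learn
idx, out = memory_lookup(x)
loss = out.pow(2).mean()         # placeholder objective, not the paper's
loss.backward()

# Targeted update: zero the gradient everywhere except the touched slots, so
# knowledge stored elsewhere in the memory is not disturbed.
touched = torch.isin(torch.arange(n_slots), idx.flatten())
with torch.no_grad():
    values.grad[~touched] = 0
    values -= 0.1 * values.grad
```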