Peter Jansen ( @peterjansen-ai.bsky.social ) (@peterjansen_ai) 's Twitter Profile
Peter Jansen ( @peterjansen-ai.bsky.social )

@peterjansen_ai

Associate Professor @uarizona; Visiting Scientist @allen_ai, AI/NLP; DiscoveryWorld; EntailmentBank; ScienceWorld; textgames.org list. Tweets/opinions my own

ID: 974390207867858944

Link: http://cognitiveai.org

Joined: 15-03-2018 21:01:43

5.5K Tweets

1.1K Followers

654 Following

Akshay 🚀 (@akshay_pachaar) 's Twitter Profile Photo

A RAG engine for deep document understanding! RAGFlow lets you build enterprise-grade RAG workflows on complex docs with well-founded citations. Supports multimodal data understanding, web search, deep research, etc. 100% local & open-source with 55k+ stars!

Lucas Caccia (@lucaspcaccia) 's Twitter Profile Photo

RAG and in-context learning are the go-to approaches for integrating new knowledge into LLMs, making inference very inefficient. We propose instead Knowledge Modules: lightweight LoRA modules trained offline that can match RAG performance without the drawbacks.

Science of Science (@mishateplitskiy) 's Twitter Profile Photo

Verrrrry intriguing-looking and labor-intensive test of whether LLMs can come up with good scientific ideas. After implementing those ideas, the verdict seems to be "no, not really."

Anthropic (@anthropicai) 's Twitter Profile Photo

Anthropic staff realized they could ask Claude to buy things that weren’t just food & drink. After someone randomly decided to ask it to order a tungsten cube, Claude ended up with an inventory full of (as it put it) “specialty metal items” that it ended up selling at a loss.

Ai2 (@allen_ai) 's Twitter Profile Photo

Today we’re releasing a prototype of Genesys, an autonomous multi-agent LLM discovery system that aims to discover new types of language model architectures. We found Genesys can discover novel architectures competitive with the industry-standard transformer. 🧵

CLS (@chengleisi) 's Twitter Profile Photo

Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts. Main finding: LLM ideas result in worse projects than human ideas.

Ai2 (@allen_ai) 's Twitter Profile Photo

🚨 We're hiring a #ResearchScientist in #AI for Scientific Discovery at Ai2! Are you passionate about intelligent agents, data-driven discovery, and AI systems that accelerate science? Join us in shaping the future of research. 🧬🧠 Apply now: job-boards.greenhouse.io/thealleninstit…

Ai2 (@allen_ai) 's Twitter Profile Photo

Introducing SciArena, a platform for benchmarking models across scientific literature tasks. Inspired by Chatbot Arena, SciArena applies a crowdsourced LLM evaluation approach to the scientific domain. 🧵

James Zou (@james_y_zou) 's Twitter Profile Photo

Introducing Fractional Reasoning: a mechanistic method to quantitatively control how much thinking a LLM performs. tldr: we identify latent reasoning knobs in transformer embedding ➡️ better inference compute approach that mitigates under/over-thinking arxiv.org/pdf/2506.15882

Thomas Wolf (@thom_wolf) 's Twitter Profile Photo

We are so excited to announce a new open-source challenge in collaboration with Proxima Fusion: unlocking fusion with AI. If you haven't followed, fusion is how the sun makes energy and is –in the long term– our best bet on a clean, safe, and virtually limitless energy. In the

Kexin Huang (@kexinhuang5) 's Twitter Profile Photo

🤝Excited to announce Biomni × Anthropic! AI agents are set to transform how biologists do everyday research. Thanks to this partnership, the platform is now free for scientists worldwide: biomni.stanford.edu Learn more: anthropic.com/customers/biom…

James Zou (@james_y_zou) 's Twitter Profile Photo

📢New conference where AI is the primary author and reviewer! agents4science.stanford.edu Current venues don't allow AI-written papers, so it's hard to assess the +/- of such works🤔 #Agents4Science solicits papers where AI is the main author w/ human advisors. 💡Initial reviews by

Keyon Vafa (@keyonv) 's Twitter Profile Photo

Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws. 🧵

Derek Thompson (@dkthomp) 's Twitter Profile Photo

Two weeks ago, Marco Rubio said USAID “has little to show since the end of the Cold War.” Days earlier, a Lancet study estimated that USAID global health programs have saved 90 million lives—not since 1991, but since just 2001.

Tuhin Chakrabarty (@tuhinchakr) 's Twitter Profile Photo

Honored to get the outstanding position paper award at ICML Conference :) Come attend my talk and poster tomorrow on human centered considerations for a safer and better future of work I will be recruiting PhD students at Stony Brook University Stony Brook University Dept. of Computer Science coming fall. Please get in touch.

Ai2 (@allen_ai) 's Twitter Profile Photo

We’ve upgraded ScholarQA, our agent that helps researchers conduct literature reviews efficiently by providing detailed answers. Now, when ScholarQA cites a source, it won’t just tell you which paper it came from–you’ll see the exact quote, highlighted in the original PDF. 🧵
