Jinyan Su (on job market) (@sujinyan6) 's Twitter Profile
Jinyan Su (on job market)

@sujinyan6

PhD @Cornell; LLM personalization; RAG; LLM reasoning

ID: 1418105627352584193

Website: http://jinyansu1.github.io · Joined: 22-07-2021 07:08:34

13 Tweets

187 Followers

169 Following

Chenghao Yang (@chrome1996) 's Twitter Profile Photo

Happy Thanksgiving! Inspired by many great bloggers like Sasha Rush and Yao Fu, I made a tutorial about the "inference-time compute" techniques showcased by O1. It incorporates insights from Sasha's great talk and ongoing O1 replications. Video: youtu.be/_Bw5o55SRL8. Feedback welcome!

Samuel Marks (@saprmarks) 's Twitter Profile Photo

What can AI researchers do *today* that AI developers will find useful for ensuring the safety of future advanced AI systems? To ring in the new year, the Anthropic Alignment Science team is sharing some thoughts on research directions we think are important.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxestex) 's Twitter Profile Photo

If you can only read one DeepSeek paper in your life, read DeepSeek Math. Everything else is either ≈obvious in hindsight or clever optimization. DeepSeek Math is a tour de force of data engineering, general DL/LLM methodology, and RL, and it's just beautiful. Just 22 pages.

Niklas Muennighoff (@muennighoff) 's Twitter Profile Photo

Last week we released s1 - our simple recipe for sample-efficient reasoning & test-time scaling. We’re releasing 𝐬𝟏.𝟏 trained on the 𝐬𝐚𝐦𝐞 𝟏𝐊 𝐪𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬 but performing much better by using r1 instead of Gemini traces. 60% on AIME25 I. Details in 🧵1/9

Infini-AI-Lab (@infiniailab) 's Twitter Profile Photo

🚀 RAG vs. Long-Context LLMs: The Real Battle ⚔️ 🤯Turns out, simple-to-build RAG can match million-dollar long-context LLMs (LC LLMs) on most existing benchmarks. 🤡So, do we even need long-context models? YES. Because today’s benchmarks are flawed: ⛳ Too Simple –

Hao AI Lab (@haoailab) 's Twitter Profile Photo

Reasoning models often waste tokens self-doubting. Dynasor saves you up to 81% of tokens while still arriving at the correct answer! 🧠✂️ - Probe the model halfway through to get its certainty - Use that certainty to stop reasoning early - 100% training-free, plug-and-play 🎮 Demo: hao-ai-lab.github.io/demo/dynasor-c…
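
A rough, self-contained sketch of the certainty-based early-exit idea described in the tweet: decode the reasoning in chunks, probe the model mid-trace for a provisional answer, and stop once the probes look certain. The helper names and the agreement-based certainty proxy below are illustrative placeholders, not Dynasor's actual implementation.

```python
import random

def generate_reasoning_chunk(trace: str) -> str:
    """Stand-in for decoding one chunk of reasoning tokens from an LLM."""
    return trace + " ...one more chunk of reasoning..."

def probe_answer(trace: str) -> str:
    """Stand-in for pausing generation and asking for a provisional answer."""
    return random.choice(["42", "42", "7"])  # noisy early on; a real model converges

def reason_with_early_exit(prompt: str, max_chunks: int = 8, probes: int = 3) -> str:
    """Generate reasoning chunk by chunk; stop early once repeated probes agree."""
    trace = prompt
    for _ in range(max_chunks):
        trace = generate_reasoning_chunk(trace)
        answers = [probe_answer(trace) for _ in range(probes)]
        if len(set(answers)) == 1:  # probes agree -> treat the answer as certain
            return answers[0]
    return probe_answer(trace)      # budget exhausted: fall back to the last probe

print(reason_with_early_exit("What is 6 * 7?"))
```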

Xuandong Zhao (@xuandongzhao) 's Twitter Profile Photo

🚀 Highly recommend checking out The Future of Language Models and Transformers workshops hosted by the Simons Institute for the Theory of Computing at UC Berkeley! This is an incredible opportunity to learn about cutting-edge LLM research directly from some of the most renowned experts in the field.

Lilian Weng (@lilianweng) 's Twitter Profile Photo

Giving your models more time to think before prediction, e.g., via smart decoding, chain-of-thought reasoning, latent thoughts, etc., turns out to be quite effective for unblocking the next level of intelligence. New post is here :) "Why we think": lilianweng.github.io/posts/2025-05-…
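
For the chain-of-thought flavor of test-time thinking mentioned above, a minimal illustration of the prompting difference (prompt strings only; no particular model API is assumed, and the question is a made-up example):

```python
question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

# Direct prompting: the model must commit to an answer immediately.
direct_prompt = f"Q: {question}\nA:"

# Chain-of-thought prompting: the model is nudged to spend output tokens on
# intermediate reasoning steps before giving the final answer.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

print(direct_prompt)
print(cot_prompt)
```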