Xingyu Fu (@xingyufu2)'s Twitter Profile
Xingyu Fu

@xingyufu2

PhD student @Penn @cogcomp | Focused on Vision+Language | Previously: @MSFTResearch, @AmazonScience | B.S. @UofIllinois

ID: 1305996908264075270

Website: https://zeyofu.github.io/ | Joined: 15-09-2020 22:28:30

116 Tweets

879 Followers

514 Following

Fei Wang (@fwang_nlp)

MuirBench is officially accepted at #ICLR2025! Recent VLMs/MLLMs such as LLaVA-OneVision, MM1.5, and MAmmoTH-VL have demonstrated significant progress on MuirBench. Excited to see how MuirBench continues to drive the innovation of VLMs! #AI #MachineLearning #VLM

Sheng Zhang (@sheng_zh)

MuirBench has been accepted to #ICLR2025! Companies like Apple, TikTok, and Salesforce are already evaluating their LMMs on its multi-image setup - a robust testbed for multimodal reasoning. GenAI needs more benchmarks like this. Kudos to Fei Wang, Xingyu Fu, and team!

Xiaodong Yu (@xiaodong_yu_126)

Check out our new paper on long-context understanding! We use AgenticLU to significantly improve the base model's long-context performance (+14.7% avg. across several datasets) without increasing real inference time!

Yushi Hu (@huyushi98)

Excited to see the image reasoning in o3 and o4-mini!! We introduced this idea a year ago in Visual Sketchpad (visualsketchpad.github.io). Excited to see OpenAI baking this into their model through agentic RL. Great work! And yes, reasoning should be multimodal! Huge shoutout

Weijia Shi (@weijiashi2)

Our previous work showed that creating visual chain-of-thoughts via tool use significantly boosts GPT-4o's visual reasoning performance. Excited to see this idea incorporated into OpenAI's o3 and o4-mini models (openai.com/index/thinking…).

Yu Feng (@anniefeng6)

#ICLR2025 Oral

LLMs often struggle with reliable and consistent decisions under uncertainty, largely because they can't reliably estimate the probability of each choice.

We propose BIRD, a framework that significantly enhances LLM decision making under uncertainty. BIRD

Sayak Paul (@risingsayak)

Embedding a scientific basis in pre-trained T2I models can enhance the realism and consistency of the results.

Cool work in "Science-T2I: Addressing Scientific Illusions in Image Synthesis"

jialuo-li.github.io/Science-T2I-We…

Jialuo Li (@jialuoli1007)

Introducing Science-T2I - towards bridging the gap between AI imagination and scientific reality in image generation! [CVPR 2025]

Paper: arxiv.org/abs/2504.13129
Project: jialuo-li.github.io/Science-T2I-Web
Code: github.com/Jialuo-Li/Scie…
Dataset: huggingface.co/collections/Ji…

Lucas Beyer (bl16) (@giffmana)

This paper is interestingly thought-provoking for me. There is a chance that it's easier to "align t2i model with real physics" in post-training, and let it learn to generate whatever (physically implausible) combinations in pretraining, as opposed to trying hard to come up with

Fei Wang (@fwang_nlp)

Excited to share that our paper, "MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding", will be presented at #ICLR2025!
Date: April 24
Time: 3:00 PM
Location: Hall 3 + Hall 2B #11
MuirBench challenges multimodal LLMs with diverse multi-image

Xingyu Fu (@xingyufu2)

ReFocus: visual reasoning for Tables and Charts with edits. Happy to share that ReFocus has been accepted at #ICML2025. We've open-sourced code and training data: zeyofu.github.io/ReFocus/ ReFocus enables multimodal LMs to better reason on Tables and Charts with visual edits. It also provides

Xingyu Fu (@xingyufu2)

Been wanting to post since March but waited for the graduation photo… Thrilled to finally share that I'll be joining Princeton University as a postdoc at Princeton PLI this August!

Endless thanks to my incredible advisors and mentors from Penn, UW, Cornell, NYU, UCSB, USC,

Mingyuan Wu (@mingyuanwu4)

Research with amazing collaborators Jize Jiang, Meitang Li, and Jingcheng Yang, guided by great advisors and supported by the generous help of talented researchers Bowen Jin, Xingyu Fu, and many open-source contributors (easyr1, verl, vllm, etc.).

Xiang Yue @ICLR2025 (@xiangyue96)

People are racing to push math reasoning performance in #LLMs - but have we really asked why? The common assumption is that improving math reasoning should transfer to broader capabilities in other domains. But is that actually true?

In our study (arxiv.org/pdf/2507.00432), we

Weijia Shi (@weijiashi2)

Can data owners & LM developers collaborate to build a strong shared model while each retains control of their data? Introducing FlexOlmo, a mixture-of-experts LM enabling:
• Flexible training on your local data without sharing it
• Flexible inference to opt your data in or out

Xingyu Fu (@xingyufu2)

I will be at #ICML2025 next week and present #ReFocus on Tuesday afternoon.
Location: West Exhibition Hall B2-B3 #W-202
Time: Tue 15 Jul, 4:30 p.m.-7:00 p.m. PDT
Happy to chat and connect! Feel free to DM. ReFocus link: huggingface.co/datasets/ReFoc…

Yong Lin (@yong18850571)

(1/4) Introducing Goedel-Prover V2
The strongest open-source theorem prover to date.
#1 on PutnamBench: solves 64 problems, with far less compute.
New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B's 82.4%.
* 8B > 671B: Our 8B