
Zihao Ye
@ye_combinator
Building flashinfer (github.com/flashinfer-ai/…)
ID: 916605919210827777
https://homes.cs.washington.edu/~zhye/ 07-10-2017 10:07:35
118 Tweets
1.1K Followers
511 Following

If you are in the Bay Area, make sure to attend the #MLSys2025 keynote tomorrow by Soumith Chintala at the Santa Clara Convention Center. Check out the full program at mlsys.org

Been excited about this talk for a while: Songlin Yang on efficient architecture! Just started! youtube.com/watch?v=j4zJbr…

Another 🔥 blog about CUTLASS from Colfax International, this time focusing on the gory details of block-scaled MXFP and NVFP data types and Blackwell kernels for them. research.colfax-intl.com/cutlass-tutori…


SGLang is an early user of FlashInfer and witnessed its rise as the de facto LLM inference kernel library. It won best paper at MLSys 2025, and Zihao now leads its development at NVIDIA AI Developer. SGLang’s GB200 NVL72 optimizations were made possible with strong support from the
