Sheng-Yu Wang (@shengyuwang6) 's Twitter Profile
Sheng-Yu Wang

@shengyuwang6

PhD Student @CarnegieMellon

ID: 1314449508747546625

Link: http://peterwang512.github.io · Joined: 09-10-2020 06:16:08

80 Tweets

388 Followers

540 Following

maxwell jones (@maxwell54650346) 's Twitter Profile Photo

I recently gave a talk at CMU about DeepSeek v3 and DeepSeek R1 (as many people are interested haha), and the talk was recorded so I thought I'd share both the video and the slides! Hopefully they can be of use :) video: youtu.be/qGpZAnYcOvs slides: maxwelljon.es/assets/pptx/De…

Nupur Kumari (@nupurkmr9) 's Twitter Profile Photo

Can we generate a training dataset of the same object in different contexts for customization? Check out our work SynCD, which uses Objaverse assets and shared attention in text-to-image models to do exactly that. cs.cmu.edu/~syncd-project/ w/ Xi Yin Jun-Yan Zhu Ishan Misra Samaneh Azadi
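For intuition, here is a minimal NumPy sketch of what "shared attention" across a set of images could look like: keys and values are pooled over the whole set so every image attends to every other one. This is an illustrative assumption, not the SynCD implementation.

```python
import numpy as np

def shared_attention(q, k, v):
    """q, k, v: (S, n, d) arrays for S images of the same object, n tokens each.
    Keys/values are concatenated across the set so each image attends to all
    images, encouraging a consistent object identity across generations."""
    S, n, d = q.shape
    k_shared = k.reshape(S * n, d)                 # pool keys over the set
    v_shared = v.reshape(S * n, d)                 # pool values over the set
    scores = q @ k_shared.T / np.sqrt(d)           # (S, n, S*n)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v_shared                         # (S, n, d)

# Toy usage: 3 images, 8 tokens each, 16-D features.
q = k = v = np.random.randn(3, 8, 16)
print(shared_attention(q, k, v).shape)  # (3, 8, 16)
```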

Taesung Park (@taesung) 's Twitter Profile Photo

Excited to come out of stealth at Reve! Today's text-to-image/video models, in contrast to LLMs, lack logic. Images seem plausible initially but fall apart under scrutiny: painting techniques don't match, props don't carry meaning, and compositions lack intention. (1/4)

Jun-Yan Zhu (@junyanz89) 's Twitter Profile Photo

Hi there, Phillip Isola and I wrote a short article (500 words) on Generative Modeling for the Open Encyclopedia of Cognitive Science. We briefly discuss the basic concepts of generative models and their applications. Don't miss out Phillip Isola's hand-drawn cats in Figure 1!

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

LegoGPT: Generating Physically Stable and Buildable LEGO Designs from Text "We introduce LegoGPT, the first approach for generating physically stable LEGO brick models from text prompts. To achieve this, we construct a large-scale, physically stable dataset of LEGO designs,

Jun-Yan Zhu (@junyanz89) 's Twitter Profile Photo

We've released the code for LegoGPT. This autoregressive model generates physically stable and buildable designs from text prompts, by integrating physics laws and assembly constraints into LLM training and inference. This work is led by PhD students Ava Pun, Kangle Deng,
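As a rough, hypothetical illustration of constraint-aware generation (not the released LegoGPT code), the toy loop below rejects sampled bricks that collide or float, standing in for the physics and assembly checks the tweet describes; `sample_brick`, `collides`, and `supported` are made-up helpers.

```python
import random

# Hypothetical toy setup: each "token" places a 1x1 brick at integer (x, y, z).

def collides(brick, placed):
    return brick in placed                       # toy check: one brick per cell

def supported(brick, placed):
    x, y, z = brick
    return z == 0 or (x, y, z - 1) in placed     # toy check: on ground or on a brick below

def sample_brick(placed):
    """Stand-in for the autoregressive model's next-token distribution."""
    return (random.randint(0, 3), random.randint(0, 3), random.randint(0, 2))

def generate_design(n_bricks=10, max_tries=100):
    placed = set()
    for _ in range(n_bricks):
        for _ in range(max_tries):
            brick = sample_brick(placed)
            # Reject samples that violate buildability, analogous to
            # enforcing constraints during inference.
            if not collides(brick, placed) and supported(brick, placed):
                placed.add(brick)
                break
    return placed

print(generate_design())
```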

Donglai Xiang (@donglaixiang) 's Twitter Profile Photo

🚨Excited to announce the 1st Workshop on Vision Meets Physics at @CVPR2025! Join us on June 12 for a full-day event exploring the synergy between physical simulation & computer vision to bridge the gap between the virtual and physical worlds. URL: tinyurl.com/vis-phys

Lili (@lchen915) 's Twitter Profile Photo

One fundamental issue with RL – whether it’s for robots or LLMs – is how hard it is to get rewards. For LLM reasoning, we need ground-truth labels to verify answers. We found that maximizing confidence alone allows LLMs to improve their reasoning with RL!
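A minimal sketch of the confidence idea, assuming the reward is defined as negative mean per-token entropy of the model's own output distribution (an illustrative choice, not necessarily the authors' exact objective):

```python
import numpy as np

def confidence_reward(token_probs):
    """Negative mean per-token entropy: higher when the model is more confident.
    token_probs: list of probability vectors over the vocabulary, one per generated token."""
    entropies = [-np.sum(p * np.log(p + 1e-12)) for p in token_probs]
    return -float(np.mean(entropies))

# Toy example: a confident answer earns a higher reward than an uncertain one,
# so an RL loop could optimize this signal without ground-truth labels.
confident = [np.array([0.97, 0.01, 0.01, 0.01])] * 5
uncertain = [np.array([0.25, 0.25, 0.25, 0.25])] * 5
print(confidence_reward(confident) > confidence_reward(uncertain))  # True
```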

Amil Dravid (@_amildravid) 's Twitter Profile Photo

Artifacts in your attention maps? Forgot to train with registers? Use test-time registers! We find a sparse set of activations that set artifact positions. We can shift them anywhere ("Shifted"), even outside the image into an untrained token. Clean maps, no retrain.
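A minimal sketch of the described trick, with all names and thresholds hypothetical: append an extra "register" token at test time and move the high-norm artifact activations onto it, so the remaining token maps come out clean.

```python
import numpy as np

def add_test_time_register(tokens, norm_threshold=10.0):
    """tokens: (n, d) patch activations from a frozen ViT layer.
    Appends one register token and shifts outlier (high-norm) activations onto it."""
    n, d = tokens.shape
    out = tokens.copy()
    register = np.zeros((1, d))
    norms = np.linalg.norm(out, axis=1)
    outliers = norms > norm_threshold              # sparse set of artifact positions
    # Move the outlier activations into the untrained register token,
    # then fill the vacated positions with a typical (mean) activation.
    register[0] = out[outliers].sum(axis=0)
    out[outliers] = out[~outliers].mean(axis=0) if (~outliers).any() else 0.0
    return np.concatenate([out, register], axis=0)  # (n + 1, d)

# Toy usage: two of eight patch tokens carry high-norm artifacts.
x = np.random.randn(8, 4)
x[[2, 5]] *= 50
print(add_test_time_register(x).shape)  # (9, 4)
```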

Keenan Crane (@keenanisalive) 's Twitter Profile Photo

For folks in the ACM SIGGRAPH community: You may or may not be aware of the controversy around the next #SIGGRAPHAsia location, summarized here: cs.toronto.edu/~jacobson/webl… If you're concerned, consider signing this letter: docs.google.com/document/d/1ZS… via this form docs.google.com/forms/d/e/1FAI…

Skild AI (@skildai) 's Twitter Profile Photo

We’ve all seen humanoid robots doing backflips and dance routines for years. But if you ask them to climb a few stairs in the real world, they stumble! We took our robot on a walk around town to environments that it hadn’t seen before. Here’s how it works 🧵⬇️

Nupur Kumari (@nupurkmr9) 's Twitter Profile Photo

🚨Reminder: Submissions for short papers to the Personalization in Generative AI Workshop at #ICCV2025 are due today!!! OpenReview: openreview.net/group?id=thecv…

Sirui Chen (@eric_srchen) 's Twitter Profile Photo

Introducing HEAD 🤖, an autonomous navigation and reaching system for humanoid robots, which allows the robot to navigate around obstacles and touch an object in the environment. More details on our website and CoRL paper: stanford-tml.github.io/HEAD

Yufei Ye (@yufei_ye) 's Twitter Profile Photo

Delivering the robot close enough to a target is an important yet often overlooked prerequisite for any meaningful robot interaction. It requires robust locomotion, navigation, and reaching all at once. HeAD is an automatic vision-based system that handles all of them.

Gaurav Parmar (@gauravtparmar) 's Twitter Profile Photo

When exploring ideas with generative models, you want a range of possibilities. Instead, you often disappointingly get a gallery of near-duplicates. The culprit is standard I.I.D. sampling. We introduce a new inference method to generate high-quality and varied outputs. 1/n
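The tweet does not spell out the new inference method; as a generic stand-in, the sketch below shows one way to favor variety over near-duplicates by greedily picking candidates that are far apart in an embedding space (farthest-point selection, not the paper's algorithm).

```python
import numpy as np

def diverse_subset(embeddings, k):
    """Greedy farthest-point selection: pick k candidates that are far apart
    in embedding space, instead of keeping the first k I.I.D. samples."""
    chosen = [0]                                     # start from an arbitrary candidate
    dists = np.linalg.norm(embeddings - embeddings[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))                  # farthest from everything chosen so far
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return chosen

# Toy usage: 100 candidate generations embedded in 16-D; keep 4 varied ones.
cands = np.random.randn(100, 16)
print(diverse_subset(cands, 4))
```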
