Shijie Wang (@wananxy1)'s Twitter Profile
Shijie Wang

@wananxy1

software engineer @AlibabaGroup, ex @HuazhongUST @AlibabaGroup @OneFlowNews. github: github.com/simonJJJ

ID: 1101717121418055680

Joined: 02-03-2019 05:33:25

39 Tweets

178 Followers

602 Following

Rowan Zellers (@rown)'s Twitter Profile Photo

I wrote a blog post on why I decided to join OpenAI instead of academia (after I went on the academic & industry job markets and got offers from both). This post (pt 2 in a series) took a while 😅 - hoping my experience helps others make life decisions! rowanzellers.com/blog/rowan-job…

Justin Johnson (@jcjohnss)'s Twitter Profile Photo

I'm excited about Segment Anything, released by FAIR today. It tackles an old problem (finding objects in images) at large scale: trained on 11M images and 1B objects. This is a new foundation model for computer vision - it recognizes any object in any context.
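
A minimal usage sketch of the automatic mask generator, following the facebookresearch/segment-anything README; the checkpoint filename and image path are placeholders:

```python
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

# Load a pretrained SAM checkpoint ("vit_h" is the largest variant;
# the .pth path below is a placeholder for a downloaded checkpoint).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an RGB uint8 array of shape (H, W, 3).
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)

# Returns one dict per detected object, each with a binary
# 'segmentation' mask plus 'area', 'bbox', and quality scores.
masks = mask_generator.generate(image)
print(f"found {len(masks)} object masks")
```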

Junyang Lin (@justinlin610)'s Twitter Profile Photo

Just made a demo space (unofficial) of ImageBind for zero-shot image classification :) Hope it helps you figure out its embedding quality. Will add demos of other modalities soon. See huggingface.co/spaces/JustinL…
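
The same zero-shot trick can be reproduced locally; a sketch based on the facebookresearch/ImageBind README (the class labels and image path are placeholders):

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained ImageBind model (weights download on first use).
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

class_names = ["a dog", "a car", "a bird"]
image_paths = ["example.jpg"]

inputs = {
    ModalityType.TEXT: data.load_and_transform_text(class_names, device),
    ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device),
}
with torch.no_grad():
    embeddings = model(inputs)

# Zero-shot classification: softmax over image-text similarity,
# possible precisely because all modalities share one embedding space.
probs = torch.softmax(
    embeddings[ModalityType.VISION] @ embeddings[ModalityType.TEXT].T, dim=-1
)
print(dict(zip(class_names, probs[0].tolist())))
```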

Junyang Lin (@justinlin610)'s Twitter Profile Photo

Happy to release ONE-PEACE, a general representation model towards unlimited modalities (vision, language, audio, vision-language, etc., for now). New SOTAs plus emergent zero-shot capabilities. Code to be released. abs: arxiv.org/abs/2305.11172 code: github.com/OFA-Sys/ONE-PE…

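Since the official code was not yet out at tweet time, here is a purely hypothetical sketch of what a shared multimodal embedding space buys you; encode_text / encode_image / encode_audio are stand-in names, not ONE-PEACE's real API:

```python
import torch
import torch.nn.functional as F

EMB_DIM = 1536  # placeholder embedding width

# Stand-ins for per-modality encoders; a real model maps every
# modality into the same space, so the placeholders share EMB_DIM.
def encode_text(texts):
    return torch.randn(len(texts), EMB_DIM)

def encode_image(paths):
    return torch.randn(len(paths), EMB_DIM)

def encode_audio(paths):
    return torch.randn(len(paths), EMB_DIM)

text_emb = F.normalize(encode_text(["a dog barking"]), dim=-1)
image_emb = F.normalize(encode_image(["dog.jpg"]), dim=-1)
audio_emb = F.normalize(encode_audio(["bark.wav"]), dim=-1)

# Because everything lives in one space, cosine similarity gives
# text-image, text-audio, and image-audio retrieval for free.
print((text_emb @ image_emb.T).item(), (text_emb @ audio_emb.T).item())
```
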
Junyang Lin (@justinlin610)'s Twitter Profile Photo

On Chinese Valentine's Day, we release the multimodal vision-language model Qwen-VL, built on our previously released Qwen-7B! This model can perform multi-image interleaved conversation (chat with images), visual grounding, text recognition, etc.
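
A quickstart sketch in the spirit of the Qwen-VL README, assuming the Qwen/Qwen-VL-Chat checkpoint on Hugging Face; the image path and prompts are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code pulls in Qwen-VL's custom tokenizer/model classes.
tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-VL-Chat", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-VL-Chat", device_map="auto", trust_remote_code=True
).eval()

# Interleave images and text in a single query (multi-image chat
# works the same way: just list more {"image": ...} entries).
query = tokenizer.from_list_format([
    {"image": "demo.jpeg"},
    {"text": "What is in this picture?"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)

# Follow-up turn reusing history, e.g. visual grounding.
response, history = model.chat(
    tokenizer, "Output the bounding box of the dog", history=history
)
print(response)
```
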
Xinggang Wang (@xinggangwang)'s Twitter Profile Photo

Thrilled to have two papers featured among paperdigest.org's most influential papers at top-tier AI conferences! 4DGS ranks 3rd out of 2719 CVPR'24 papers, and Vision Mamba ranks 2nd out of 2609 ICML'24 papers (1st by citation count alone).
