Marc Najork (@marc_najork) 's Twitter Profile
Marc Najork

@marc_najork

Research Engineering Director at Google

ID: 126132953

http://marc.najork.org

24-03-2010 23:05:44

110 Tweets

537 Followers

91 Following

Beliz Gunel (@belizgunel) 's Twitter Profile Photo

Join us in #KDD2021 for the DI workshop (document-intelligence.github.io/DI-2021/) if you are interested in #DocumentAI, an intersection that spans NLP, CV, knowledge representation, and more. I will present on Glean's efforts on data-efficient generalization to different languages and doc types.

Beliz Gunel (@belizgunel) 's Twitter Profile Photo

We build on the hypothesis that form-like documents share a visual design language and that the representation learning approach of Glean naturally enables multi-domain training and fine-tuning across considerably different document types and different languages, by its design.

Google AI (@googleai) 's Twitter Profile Photo

Recently, we released TF-Ranking—an open-source TF-based library—that makes building customized learning-to-rank models easier and facilitates fast exploration of new model structures for production and research. Learn more and grab the code on the blog ↓ goo.gle/3kKRRR7

DESIRES2024 (@desires_ir) 's Twitter Profile Photo

+++ SOCIAL PROGRAM +++ The DESIRES' social program is out! Check out where we will enjoy our fantastic walks, aperitifs and dinners. desires.dei.unipd.it/#schedule

wikimediatech (@wikimediatech) 's Twitter Profile Photo

Today marks the start of the Wikipedia Image/Caption Matching Challenge. Learn more in this new blog post by Miriam Redi, Fabian Kaelin, Tiziano Piccardi #kaggle #dataset #research #wikipedia #wikimedia techblog.wikimedia.org/2021/09/09/the…

Miriam Redi (@mad_astronaut) 's Twitter Profile Photo

Wikipedia is missing images, and its images are missing captions. With the Wikipedia Image/Caption Competition we are inviting everyone to help us address this gap! And we are releasing a dataset of millions of images from Wikipedia articles in 100+ languages! I am SO excited😍

Google AI (@googleai) 's Twitter Profile Photo

Multimodal visio-linguistic models rely on rich datasets to model the relationship between images and text—today we introduce a new large multimodal dataset that is multilingual and the first to include contextual fields. Learn more about how it was built ↓ goo.gle/2W1P8J6

Marc Najork (@marc_najork) 's Twitter Profile Photo

Just received a CFP from the ‘International Journal of Management and Humanities (IJMH)’. Choice quote from the CFP: "Article can accept if plagiarism is less than 20%". Wow! I just hope that the editors are not only confused about grammar but also the meaning of "plagiarism".

Michael Bendersky (@bemikelive) 's Twitter Profile Photo

Our paper "Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models" (by Tao Chen, Mingyang Zhang, Jing Lu, Michael Bendersky, Marc Najork; to appear in ECIR, 2022) is now on arXiv: arxiv.org/abs/2201.10582

Wiki Workshop 2025 (@wikiworkshop) 's Twitter Profile Photo

Congrats to the recipients of the Wikimedia Foundation Research Award of The Year!! 🎉"WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning" Srinivasan et al 🎉"Assessing the quality of sources in @Wikidata across languages: a hybrid approach" Amaral et al

Marc Najork (@marc_najork) 's Twitter Profile Photo

Elon Musk I could use some "help from the top" with our Tesla Solar Roof project. Roof was installed Sept 2021 but final inspection with City of Palo Alto has not happened; permit will expire on August 25 after 180 days of inactivity. Tesla contact unresponsive since June 15.

Marc Najork (@marc_najork) 's Twitter Profile Photo

I just discovered a "creative re-use" of our 1999 Mercator paper on scalable web crawling. It employs "tortured phrases" -- using semantic mapping services (e.g. EN->FR->EN machine translation) to evade plagiarism detection software. Read all about it at marcnajork.blogspot.com/2022/10/tortur…

Michael Bendersky (@bemikelive) 's Twitter Profile Photo

If you're attending #KDD2023, hope you will check out some of the work our team will be presenting at the Applied Data Science Track this week. Congratulations to all the authors! Brief descriptions of the papers in 🧵; feel free to reach out to the lead authors for details.