Dan Deutsch (@_danieldeutsch)'s Twitter Profile
Dan Deutsch

@_danieldeutsch

Research Scientist at Google Translate working on text generation evaluation

ID: 821649037

Link: https://danieldeutsch.github.io/

Joined: 13-09-2012 14:52:58

89 Tweets

611 Followers

89 Following

Dan Deutsch (@_danieldeutsch)

New application link! google.com/about/careers/… I am at EMNLP/WMT this week. Please come find me if you want to learn more about this role!

Dan Deutsch (@_danieldeutsch)

The Google Translate Research Team is looking for interns this summer! Apply here if you will graduate from a PhD program in the 2025-2026 academic year, and send me an email to let me know that you applied. google.com/about/careers/…

Jurik Juraska (@jurikjuraska)

🌐 Meet MetricX-24, our SOTA machine translation evaluation metric and a successor to the successful MetricX-23. 🚀 Now open-source in PyTorch/Transformers! 🎉 Ready to take this top performer in the WMT24 Metrics Shared Task for a spin? 🔗 Code: github.com/google-researc…

Jurik Juraska (@jurikjuraska)

🚀 We have just released bfloat16 variants of all 3 MetricX-24 models, offering nearly identical performance to their float32 counterparts, but with a 50% smaller memory footprint. ✨ We hope this makes the XL and XXL models more accessible! 🔗 GitHub: github.com/google-researc…

Yusuf Kocyigit (@mykocyigit)

Thrilled to share our latest findings on data contamination, from my internship at Google! We trained almost 90 models at the 1B and 8B scales with various contamination types, using machine translation as our task, and analyzed the impact of contamination. arxiv.org/abs/2501.18771

iseeaswell꩜bʂky (@iseeaswell)

😼SMOL DATA ALERT! 😼Announcing SMOL, a professionally-translated dataset for 115 very low-resource languages! Paper: arxiv.org/pdf/2502.12301 Huggingface: huggingface.co/datasets/googl…

Markus Freitag (@markuseful)

Two new datasets from Google Translate targeting high and low resource languages!
WMT24++: 46 new en->xx languages to WMT24, bringing the total to 55
SMOL: 6M tokens for 115 very low-resource languages
WMT24++: huggingface.co/datasets/googl…
SMOL: huggingface.co/datasets/googl…