
Goran Glavaš
@gg42554
Professor for #NLProc @Uni_WUE. Moving to Bluesky: bsky.app/profile/gglava…
ID: 382302464
https://sites.google.com/view/goranglavas 29-09-2011 20:48:47
405 Tweets
1.1K Followers
257 Following

🤔 If you're interested in more #multilingual news data for other #NLProc tasks, check out PolyNews 📰 on Hugging Face! w/ 77 low & high-resource languages in 19 scripts 🌍 🤗 huggingface.co/datasets/aiana… 📃 arxiv.org/abs/2406.12634 w/ Fabian David Schmidt Goran Glavaš Heiko Paulheim Data and Web Science Group

Check out our massively multilingual and (partially) multi-parallel news dataset PolyNews! Great work by Andreea Iana on compiling this massively multilingual domain-specific data as well as on using it to improve multilingual sentence encoders for news recommendation!

Great work by Andreea Iana, who put immense effort into collecting and cleaning such a massively multi-parallel news dataset. I reckon that such a domain-specific multi-parallel corpus is of quite some interest for the MT folks :)!

Great effort by Gregor Geigle: we test if explicit grounding objectives reduce hallucination of Large Vision-Language Models. We confirm that they yield better fine-grained image understanding performance, but this does not propagate to less hallucination in open captioning!



I really enjoyed working with Valentin Hofmann on this! The highlight of this work for me is Figure 6: rendering toponym names from their embeddings obtained from the LM after geoadaptation, we basically obtained the map (for the BCMS area)!


Intermediate code representations like LLVM can indeed be a great facilitator of cross-programming-language transfer for Code-LLMs! Well-deserved Outstanding Paper Award for Indraneil Paul for this great work! It was a pleasure to be part of the effort!

If you're looking for on-the-fly customization of your news recommendation function, then MANNeR is the framework for you! Great work by Andreea Iana!

🔎 What's beneath the surface of encoder architectures in news #recsys? 🤔 Our latest work w/ Goran Glavaš Heiko Paulheim goes beyond recommendation accuracy to shed💡on how news & user encoders behave w.r.t. representational similarity! 🔗 Read more: arxiv.org/abs/2410.01470 👇

Yes, come to Fabian David Schmidt's poster on Tuesday! (even I will be there and I haven't been to a conference in 2.5 years :))

If you're into Vision-LLMs, come check out Gregor Geigle's amazing work! See you in Miami ;)

Tired of work that probes LLMs or uses them as agents? Andreea Iana will present something cool and different: come check her great work on flexible news recommendation.


If you're looking for a good recipe for training a multilingual LVLM, or just a very strong multilingual LVLM to use that supports 100 languages (built following the identified "optimal" recipe), check out our latest work! Gregor Geigle and Florian Schneider as lead authors!

Great new work on multilingual news recommendation (NR) by Andreea Iana! New datasets for multilingual and cross-lingual NR as well as a SotA NR model, newly domain-adapted from a multilingual sentence encoder!

Joint work with Florian Schneider, Chris Biemann, and Goran Glavaš. My first paper on multilingual vision-language, and I couldn't be happier with how this work turned out!🙂


📢 Introducing Walk&Retrieve, a simple yet effective zero-shot #RAG framework based on #knowledgegraph walks! arXiv: arxiv.org/abs/2505.16849 GitHub: github.com/MartinBoecklin… Joint work w/ Martin Böckling Heiko Paulheim Data and Web Science Group IR-RAG #SIGIR2025 Details 👇
