Goran Glavaš (@gg42554)'s Twitter Profile
Goran Glavaš

@gg42554

Professor for #NLProc @Uni_WUE. Moving to Bluesky: bsky.app/profile/gglava…

ID: 382302464

https://sites.google.com/view/goranglavas
29-09-2011 20:48:47

405 Tweets

1.1K Followers

257 Following

Andreea Iana (@iana_andreea)

🤔 If you're interested in more #multilingual news data for other #NLProc tasks, check out PolyNews 📰 on Hugging Face ! w/ 77 low & high-resource languages in 19 scripts 🌍 🤗 huggingface.co/datasets/aiana… 📃 arxiv.org/abs/2406.12634 w/ Fabian David Schmidt Goran Glavaš Heiko Paulheim Data and Web Science Group

Goran Glavaš (@gg42554)

Check out our massively multilingual and (partially) multi-parallel news dataset PolyNews! Great work by Andreea Iana on compiling this massively multilingual domain-specific data as well as on using it to improve multilingual sentence encoders for news recommendation!

Goran Glavaš (@gg42554)

Great work by Andreea Iana, who put immense effort into collecting and cleaning such a massively multi-parallel news dataset. I reckon that such a domain-specific multi-parallel corpus is of quite some interest for the MT folks :)!

Goran Glavaš (@gg42554)

Great effort by Gregor Geigle: we test if explicit grounding objectives reduce hallucination of Large Vision-Language Models. We confirm that they yield better fine-grained image understanding performance, but this does not propagate to less hallucination in open captioning!

Goran Glavaš (@gg42554)

Can your Large Vision-Language Model tell a Keeshond from a Samoyed? We show that fine-grained object classification is a skill quite complementary to image understanding tested by existing benchmarks and that LVLMs don't excel on the task, to say the least.

Goran Glavaš (@gg42554)

I really enjoyed working with Valentin Hofmann on this! The highlight of this work for me is Figure 6: rendering toponym names from their embeddings obtained from the LM after geoadaptation, we basically obtained the map (for the BCMS area)!

UKP Lab (@ukplab)

Code LMs are improving fast 📈, but they are limited in low-resource programming languages (PLs). 😬 In this #ACL2024NLP paper, we pre-train code LMs on source-compiler IR pairs for low-resource PLs💪 – 🧵 (1/7) Poster: Mon 4 PM - Oral: Wed 10:30 AM 📄: arxiv.org/abs/2403.03894

Goran Glavaš (@gg42554)

Intermediate code representations like LLVM can indeed be a great facilitator of cross-programming-language transfer for Code-LLMs! Well-deserved Outstanding Paper Award for Indraneil Paul for this great work! It was a pleasure to be part of the effort!

Goran Glavaš (@gg42554)

If you're looking for on-the-fly customization of your news recommendation function, then MANNeR is the framework for you! Great work by Andreea Iana!

Andreea Iana (@iana_andreea)

🔎 What's beneath the surface of encoder architectures in news #recsys? 🤔 Our latest work w/ Goran Glavaš Heiko Paulheim goes beyond recommendation accuracy to shed💡on how news & user encoders behave w.r.t. representational similarity! 🔗 Read more: arxiv.org/abs/2410.01470 👇

Goran Glavaš (@gg42554)

Tired of work that probes LLMs or uses them as agents? Andreea Iana will present something cool and different: come check her great work on flexible news recommendation.

Goran Glavaš (@gg42554)

Great work by fschmidt! Afaik, it's the first massively multilingual benchmark for spoken language understanding (and not just topical classification of speech utterances :). Ready "out-of-the-box" on HF datasets. Paper coming soon (but all important details already described).

Goran Glavaš (@gg42554)

If you're looking for a good recipe for training a multilingual LVLM or just a very strong multilingual LVLM to use, supporting 100 languages (built following the identified "optimal" recipe), check our latest work! Gregor Geigle and Florian Schneider as lead authors!

Goran Glavaš (@gg42554)

Great new work on multilingual news recommendation (NR) by Andreea Iana! New datasets for multilingual and cross-lingual NR as well as a SotA NR model, newly domain-adapted from a multilingual sentence encoder!

Fabian David Schmidt (@fdschmidt)

Joint work with Florian Schneider, Chris Biemann, and Goran Glavaš. My first paper on multilingual vision-language, and couldn't be happier with how this work turned out! 🙂

Andreea Iana (@iana_andreea)

📢 Introducing Walk&Retrieve, a simple yet effective zero-shot #RAG framework based on #knowledgegraph walks! Arxiv : arxiv.org/abs/2505.16849 GitHub: github.com/MartinBoecklin… Joint work w/ Martin Böckling Heiko Paulheim Data and Web Science Group IR-RAG #SIGIR2025 Details 👇
