Simone Tedeschi (@simonetedeschi_)'s Twitter Profile
Simone Tedeschi

@simonetedeschi_

Applied Scientist @Amazon AGI • PhD @SapienzaRoma

ID: 1430393813588226052

Joined: 25-08-2021 04:57:29

268 Tweets

1.1K Followers

1.1K Following

Barbara McGillivray (@barbaramcgilli)

Iacopo Ghinassi just presented our paper on Latin word sense disambiguation at LREC COLING 2024: we used language pivoting on English to boost the task on Latin. More research on this to come, watch this space!

Hoyeon Chang (@hoyeon_chang)

🚨 New paper 🚨
How Large Language Models Acquire Factual Knowledge During Pretraining?

I’m thrilled to announce the release of my new paper! 🎉

This research explores how LLMs acquire and retain factual knowledge during pretraining. Here are some key insights:
Babelscape (@babelscape)

We are proud to share that our paper, "CNER: Concept and Named Entity Recognition", a joint work with SapienzaNLP, has been presented at #NAACL24! 🥳 Looking forward to engaging with the community. #NAACL2024 #AI #NLProc #Research #NER

Hitesh Patel (@hitesh_lpatel)

ADVSCORE: A Metric for the Evaluation and Creation of Adversarial Benchmarks

This paper introduces ADVSCORE, a metric to evaluate and create high-quality adversarial datasets. ADVQA, a robust question answering dataset, effectively fools models but not humans. This approach
Hitesh Patel (@hitesh_lpatel)

ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming

The paper introduces ALERT, a benchmark for assessing the safety of LLMs. It employs a fine-grained risk taxonomy to evaluate LLMs' propensity to generate harmful content and
Steffi Chern (@steffichern)

🚀How can we effectively evaluate and prevent superintelligent LLMs from deceiving others?
We introduce 🤝BeHonest, a pioneering benchmark specifically designed to assess the honesty in LLMs comprehensively.

Paper 📄: [arxiv.org/abs/2406.13261]
Code 👨🏻‍💻: [github.com/GAIR-NLP/BeHon…]
Anka Reuel | @ankareuel.bsky.social (@ankareuel)

Our new paper "Open Problems in Technical AI Governance" led by Ben Bucknall & me is out! We outline 89 open technical issues in AI governance, plus resources and 100+ research questions that technical experts can tackle to help AI governance efforts🧵 t.ly/Y-mQ1

Rongwu Xu (@rongwu_xu)

☕️New paper

👉Our latest paper delves into LLMs' ability to perform safety self-correction, namely COURSE-CORRECTION.

In this paper, we:
- Benchmark course-correction ability
- Improve it using synthetic preferences.

Paper: arxiv.org/pdf/2407.16637
Code: github.com/pillowsofwind/…
Babelscape (@babelscape)

Four of our industrial #PhD students, Stefan Bejgu, Pere-Lluís Huguet Cabot, Alessandro Scirè and Simone Tedeschi, were awarded their #PhD in #AI last Friday with the best grades (and two cum laude)! Congrats all! 👏 🎉 With Roberto Navigli, their advisor and Babelscape's scientific director, in the photo

SapienzaNLP (@sapienzanlp)

Last week 5 in our group received their #PhD in #AI & #Engineering in #ComputerScience! Stefan Bejgu, Pere-Lluís Huguet Cabot, Riccardo Orlando, Alessandro Scirè, and Simone Tedeschi, all with the highest grade (+2 cum laude)! Congrats all: we are very proud of you! Four of them were/are @Babelscape

Emmy Liu (@_emliu)

What design decisions in LLM training affect the final performance of LLMs?

Scaling model size and training data is important, but it's not the only thing. We performed an analysis of 90+ open-weights models to answer this question. 🧵

arxiv.org/abs/2503.03862

(1/12)