AitorSoroa (@aitor57) Twitter Tweets • TwiCopy

AitorSoroa

@aitor57

ID: 297848855

Joined: 13-05-2011 06:53:21

623 Tweets

130 Followers

183 Following

DSN - Data Science Nigeria

@dsn_ai_network

a year ago

The highly anticipated #NeurIPS2024 conference, one of the largest in Machine Learning and computational neuroscience, kicks off today!

Over the coming days, we’ll spotlight groundbreaking research being presented, starting with “BertaQA: How much do Language Models know about

46 likes, 1 reply, 6 retweets

HiTZ zentroa (UPV/EHU)

a year ago

A Basque-speaking 'ChatGPT', via Onda Cero ondacero.es/emisoras/pais-… #HiTZintheMedia

4 likes, 0 replies, 2 retweets

HiTZ zentroa (UPV/EHU)

9 months ago

Looking for experts in artificial intelligence? At the HiTZ center there are plenty of women! #M8 #martxoak8
(Many others are missing from the photo!)

23 likes, 0 replies, 6 retweets

Oscar Sainz

9 months ago

It is very important that technologies that can have an impact on society are developed in an open way. At HiTZ zentroa (UPV/EHU) we work toward that goal, using openly licensed data to teach language models Basque and Basque culture. Want to help with this challenge? See 🧵

4 likes, 0 replies, 7 retweets

HiTZ zentroa (UPV/EHU)

9 months ago

📊 Data from the first 5 days of #Ebaluatoia! 📊 775+ users and 6000+ submissions! 🚀 Many thanks to everyone! 💕 The challenge: reach 20000 submissions before April 2! 🕒 Go to ebaluatoia.hitz.eus and ask your question!

8 likes, 0 replies, 7 retweets

HiTZ zentroa (UPV/EHU)

8 months ago

🎉 Ebaluatoia is over! 🎉 In total, 1,680 people registered and we received 12,890 submissions! Many thanks to everyone who took part! ebaluatoia.hitz.eus 📅 Heads up! The prize draw will take place on April 10 at 15:00, at the Faculty of Informatics or live on HiTZ's YT channel!

8 likes, 0 replies, 7 retweets

Eneko Agirre @eagirre.bsky.social

7 months ago

Very proud of our researchers!

1 like, 0 replies, 2 retweets

HiTZ zentroa (UPV/EHU)

7 months ago

This week we took part in #Pint25ES #pint25 with the talk "La IA en la torre de Babel" (AI in the Tower of Babel). Ander Barrena Madinabeitia was the speaker at #PINT25BIO and Eneko Agirre @eagirre.bsky.social at #pint25dss

8 likes, 0 replies, 8 retweets

HiTZ zentroa (UPV/EHU)

7 months ago

Congratulations, Jaione! The HiTZ center researcher has received the best paper award in the field of engineering and architecture! #IKERGAZTE2025

13 likes, 1 reply, 7 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

Every Thursday, the members of the HiTZ center meet at the HiTZ seminar to share news of our research. This week, two thesis projects were presented: Irune Zubiaga presented "Learning to Judge: Automated Multilingual Evaluation of LLM-Generated Text" and Blanca C-F presented "Critical Questions Generation"

13 likes, 0 replies, 7 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

[1/7]
#newHitzPaper

Many languages are underserved by open LLMs, and face the following question: Which is the best way to produce open instruction-tuned LLMs for low-resource languages?

We obtained great results for a cost-effective option!

📰 arxiv.org/abs/2506.07597

20 likes, 1 reply, 12 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

[2/7]
🤔 Why does this matter?
• Most LLMs excel in English but struggle with low-resource languages like Basque (~1000x less data than English).
• The standard instruction-tuning pipeline (base model → CPT → instruction tuning) may not be optimal for low-resource scenarios.

2 likes, 1 reply, 1 retweet

HiTZ zentroa (UPV/EHU)

6 months ago

[3/7]
🔬 Our experimental setup: 17 model variants using different backbone models (base/instruct) and data combinations (Basque corpus, English/Basque synthetic instructions).

Evaluated with 🎯 benchmarks AND🫂human preferences from 1,285 Basque speakers (12,890 annotations).

2 likes, 1 reply, 1 retweet

HiTZ zentroa (UPV/EHU)

6 months ago

[4/7]
Key findings:
1⃣Language corpora are essential: models need exposure to plain Basque text
2⃣Starting from instructed models beats the standard base→instruct pipeline
3⃣English-only instructions work well, but combining with Basque instructions yields the most robust models

2 likes, 1 reply, 2 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

[5/7]
🎉 Bonus results!

Our 70B model approaches the performance of frontier models like GPT-4o and Claude 3.5 Sonnet on both Basque benchmarks and human evaluation, even outperforming GPT-4o on local knowledge tasks.

2 likes, 1 reply, 2 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

[6/7] 🙏 Thanks to the Basque-speaking community for their participation! 💻 We're releasing models, synthetic instruction datasets, and human preference data to support future research on low-resource languages: github.com/hitz-zentroa/l…

4 likes, 1 reply, 1 retweet

HiTZ zentroa (UPV/EHU)

6 months ago

[7/7] 👤 Authors: Oscar Sainz, Naiara Perez, Julen Etxaniz, Joseba Fernandez de Landa, Itziar Aldabe, Iker García-Ferrero, Aimar Zabala, Ekhi Azurmendi, German Rigau, Eneko Agirre @eagirre.bsky.social, Mikel Artetxe & AitorSoroa

2 likes, 0 replies, 1 retweet

HiTZ zentroa (UPV/EHU)

6 months ago

We also had Maite Heredia present her PhD thesis so far, titled Evaluation of LLMs in Multilingual Settings: The Case of Code-Switching, which explores CS generation and evaluation for high- and low-resource language pairs.

6 likes, 0 replies, 3 retweets

Eneko Agirre @eagirre.bsky.social

6 months ago

Open chatbots do not work well for many languages. What is the best method for creating open chatbots for small languages?

There is good news in a recently published research paper: we have managed to build a quality chatbot for Basque!

Note: labur.eus/fltqqify

1/8 🧵👇

20 likes, 1 reply, 14 retweets

HiTZ zentroa (UPV/EHU)

6 months ago

We held the general meeting of the HiTZ center this week in Pamplona

27 likes, 0 replies, 7 retweets