Maite Melero (@maitemelero1) 's Twitter Profile
Maite Melero

@maitemelero1

ID: 809736524887683072

Joined: 16-12-2016 12:26:46

636 Tweets

186 Followers

232 Following

Marta Villegas (@martavillegasm) 's Twitter Profile Photo

📢 FLOR-6.3B, a new generative model for Catalan, Spanish & English based on BLOOM-7.1B. We modified the vocabulary and embedding layer, and continued pre-training the model on 140B tokens in our target languages 🚀huggingface.co/projecte-aina/… BSC-CNS Aina SomosNLP 👇
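
The vocabulary-and-embedding swap described above can be sketched with the Hugging Face transformers API roughly as follows; the new tokenizer path and the training step are hypothetical placeholders, not the actual FLOR recipe.

```python
# Rough sketch of swapping the vocabulary/embedding layer of BLOOM-7.1B before
# continued pre-training. The tokenizer path is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-7b1")
new_tokenizer = AutoTokenizer.from_pretrained("path/to/ca-es-en-tokenizer")  # hypothetical

# Resize the input/output embedding matrices to the new vocabulary size; added
# rows start from a fresh initialisation and are learned during continued pre-training.
model.resize_token_embeddings(len(new_tokenizer))

# Continued pre-training on ~140B tokens of Catalan/Spanish/English text would
# follow here, e.g. with transformers.Trainer or a custom training loop.
```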

Maite Melero (@maitemelero1) 's Twitter Profile Photo

Screening of the documentary All Static and Noise and a conversation with the poet and activist Abduweli Ayup about the Uyghur people's struggle to preserve their language and culture. cccb.org/ca/activitats/… CCCB

Maite Melero (@maitemelero1) 's Twitter Profile Photo

The civilian population in the Gaza Strip is trapped in the middle of the bombardments. They need urgent support. Please make your donation for emergency humanitarian aid here: ayudagaza.com

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[1/7] Introducing "Investigating the translation capabilities of Large Language Models trained on parallel data only". To our knowledge, this is the first work studying translation with #LLMs trained exclusively on parallel data. arXiv paper: arxiv.org/abs/2406.09140

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[2/7] Along with the paper we release PLUME, a family of three 2B #LLMs based on the Gemma architecture. Each model uses a different vocabulary size, from 32k up to 256k tokens. PLUME 32k: huggingface.co/projecte-aina/… PLUME 128k: huggingface.co/projecte-aina/… PLUME 256k: huggingface.co/projecte-aina/…
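
The exact repository names are truncated in the tweet above, so this loading sketch uses a placeholder id; substitute whichever projecte-aina PLUME checkpoint you want to try.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "projecte-aina/<plume-checkpoint>"  # placeholder, not a verified repo name
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
print("vocabulary size:", len(tokenizer))  # 32k, 128k or 256k depending on the variant
```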

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[3/7] Our results show that these models can perform comparably to previous Encoder-Decoder methods and that larger vocabularies lead to better performance on both supervised and zero-shot translation directions.

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[4/7] Further analysis shows that different layers specialize in different parts of the prompt. Two clear patterns we observe are the presence of sink heads that attend to the <BOS> token, and a small amount of attention to the source language tag.
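
A minimal sketch of how such sink heads can be spotted, assuming a decoder-only transformers model (the model id and prompt below are placeholders, not the paper's setup): measure, per head, how much attention mass lands on the first position.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder; any decoder-only model exposing attentions works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("eng_Latn cat_Latn Hello world", return_tensors="pt")  # hypothetical prompt format
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one tensor per layer, shaped (batch, heads, tgt_len, src_len).
for layer, attn in enumerate(out.attentions):
    bos_mass = attn[0, :, :, 0].mean(dim=-1)  # per-head average attention to position 0
    print(f"layer {layer}: strongest head puts {bos_mass.max():.2f} of its attention on the first token")
```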

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[5/7] Given the previous findings, we masked the heads with the least attention coverage. Results show that more than 47% of model heads can be removed without losing more than 2 BLEU points, the 256k model being the most resilient one, with 64.7% of heads masked.
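
Head masking of this kind can be approximated with the head_mask argument that several transformers decoder models accept; a sketch under that assumption follows. The head selection below is arbitrary, whereas the paper picks heads by attention coverage and scores the result with BLEU.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder for a model whose forward pass accepts head_mask
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# One row per layer, one column per head; 0.0 silences a head, 1.0 keeps it.
head_mask = torch.ones(model.config.n_layer, model.config.n_head)
head_mask[0, :4] = 0.0  # arbitrary example: drop the first four heads of layer 0

inputs = tokenizer("Translate: Hola món", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, head_mask=head_mask)
# Translation quality with and without the mask would then be compared, e.g. with BLEU.
```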

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[6/7] Finally, we study how the cross-lingual space is learned through the model layers. We observe that larger vocabulary sizes show smaller distances between languages at the early and middle layers.
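
One simple way to probe that cross-lingual space layer by layer (a sketch, not the paper's exact method): mean-pool the hidden states of a sentence and its translation at every layer and compare them with cosine distance. The model id and sentence pair are illustrative only.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "gpt2"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

def layer_representations(text):
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # One mean-pooled sentence vector per layer (index 0 is the embedding layer).
    return [h[0].mean(dim=0) for h in out.hidden_states]

catalan = layer_representations("El gat dorm al sofà.")
english = layer_representations("The cat sleeps on the sofa.")
for i, (a, b) in enumerate(zip(catalan, english)):
    distance = 1 - torch.nn.functional.cosine_similarity(a, b, dim=0)
    print(f"layer {i}: cosine distance = {distance.item():.3f}")
```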

Carlos Escolano (@carlosep93) 's Twitter Profile Photo

[7/7] This work was conducted at BSC-CNS thanks to funding from Aina and Proyecto Ilenia. Also, thanks to my co-authors Javier García Gilabert, Aleix Sant Savall, Francesca De Luca Fornaciari, Audrey Mash, Xixian Liao and Maite Melero.

Maite Melero (@maitemelero1) 's Twitter Profile Photo

Championing Catalan as a scientific language: the first untranslated abstract published in an international journal ara.cat/1_4d68e2?utm_s… via diariARA

Linguapax (@infolinguapax) 's Twitter Profile Photo

We'll be there! We take the opportunity to thank Emili Boix for the work he does at Linguapax as a member of the Board, and for the good humour he brings to it!

Linguapax (@infolinguapax) 's Twitter Profile Photo

Today is the International Day of Sign Languages. Did you know that there are around 300 sign languages in the world? They are part of the world's precious LINGUISTIC DIVERSITY, but they are also minoritized languages. youtu.be/_qO-ybCQQFI?si…

Linguapax (@infolinguapax) 's Twitter Profile Photo

📢 ATTENTION! The call for candidacies for the 🏆#PremiLinguapax 2024 is now open. You have until 21 February 2025 to submit nominations. More information: linguapax.org/convocatoria-d…

Maite Martín (@maite_martin) 's Twitter Profile Photo

🤬 An absolute disgrace! This call has been a disaster since it was announced, but the fact that they are not even going to resolve it really is something out of a third-world country. Politicians don't care about science (or AI) 😢 #SinCienciaNoHayFuturo - elpais.com/tecnologia/202…