Rahmad Mahendra (@rmahendrarm) Twitter Tweets • TwiCopy

Aran Komatsuzaki

5 years ago

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets By manually auditing the quality of 205 language-specific corpora, they find that lower-resource corpora have systematic issues in quality. arxiv.org/abs/2103.12028

205 likes, 4 replies, 71 retweets

Anna Rogers

@annargrs

4 years ago

#NLPaperAlert: QA Dataset Explosion!🔥 A survey of 200+ QA/RC datasets proposing a taxonomy of formats & reasoning skills. Also in the bag: modalities, conversational QA, domains & beyond-English data. Honored to work on this with Matt Gardner & Isabelle Augenstein arxiv.org/abs/2107.12708

254 likes, 9 replies, 69 retweets

Jimmy Lin

@lintool

4 years ago

Yesterday Rodrigo Nogueira, Andrew Yates, and I wrapped up the final preproduction version of "Pretrained Transformers for Text Ranking: BERT and Beyond". It is posted on arXiv as v3 (arxiv.org/abs/2010.06467), is now in the hands of Morgan & Claypool Publishers, and will be in print soon!

81 likes, 1 reply, 16 retweets

Rahmad Mahendra

@rmahendrarm

4 years ago

We are excited to share that our paper, "IndoNLI: A Natural Language Inference Dataset for Indonesian", is accepted at EMNLP 2021 main conference. Thanks and congrats to my co-authors: Clara Vania, Alham Fikri Aji, samuel_louvan, Fahrurrozi Rahman #EMNLP2021 #NLProc

26 likes, 3 replies, 2 retweets

Jia-Bin Huang

@jbhuang0604

4 years ago

How to write a paper that looks like a good one? You worked super hard and did great research, but somehow the reviewer 2 just doesn't buy it. Why? 🤔 It's probably because your paper does not look like a good paper *visually*. 🙄 How? 👇👇👇 #AcademicTwitter

2.2K likes, 22 replies, 448 retweets

Alan Ramponi

@alanramponi

4 years ago

r u intrstd in lxcl nrmlztion? 🧐 we present the shared task results at 6:50am GMT #wnut #emnlp2021 w/ Rob van der Goot, Arkaitz Zubiaga, Barbara Plank, Benjamin Muller, Iñaki San Vicente, Nikola Ljubešić, özlem çetinoğlu, Rahmad Mahendra, T.Çolakoğlu, Tim Baldwin, Tommaso Caselli @tommasoc80.bsky.social, W.Sidorenko #NLProc IT&EN blog post below! 👇

14 likes, 0 replies, 5 retweets

MIT CSAIL

@mit_csail

4 years ago

800 free computer science classes you can take online right now: bit.ly/800CSclasses

4.4K likes, 30 replies, 1.1K retweets

Alham Fikri Aji

@alhamfikri

4 years ago

Did you know that 700+ languages are spoken among 200M+ people in 🇮🇩Indonesia? Yet only a tiny portion of them has been explored in the NLP world. Our upcoming #acl2022nlp paper describes Indonesian NLP's progress, challenges & opportunities. arxiv.org/abs/2203.13357 [1/6]

451 likes, 6 replies, 99 retweets

Alham Fikri Aji

@alhamfikri

3 years ago

Finding Indonesian NLP resources is difficult, let's change that! If you have any NLP resources for Indonesian languages, you can share them through 🇮🇩NusaCrowd initiative, and be our co-author for our upcoming paper📜! Check our Github github.com/IndoNLP/nusa-c…

655 likes, 6 replies, 166 retweets

Rahmad Mahendra

@rmahendrarm

3 years ago

Multilingual sentiment analysis dataset in Acehnese, Balinese, Banjarese, Buginese, Toba Batak, Madurese, Minangkabau, Javanese, (Dayak) Ngaju, Sundanese

3 likes, 0 replies, 0 retweets

Samuel Cahyawijaya

@scahyawijaya

3 years ago

🚨 Exciting news! We are delighted to announce NusaCrowd, a new open-source initiative to collect and unite Indonesian NLP resources! 🇮🇩🇮🇩🇮🇩 Through NusaCrowd, we have gathered 137 datasets and 117 standardized data loaders covering text, audio, and image modalities💪🏾✨💕🌍

280 likes, 5 replies, 79 retweets

Genta Winata

@gentaiscool

3 years ago

It has been 3⃣ years since we started the first initiative on the Indonesian benchmark, IndoNLU, and built IndoBERT as the foundation of IndoNLP 🇮🇩. We have seen so much progress 🥳 Repo: github.com/IndoNLP/indonlu follow the🧵to explore the journey ⛵️ #indonesian #indonlp @NLProc

288 likes, 8 replies, 58 retweets

Alham Fikri Aji

@alhamfikri

3 years ago

🚨Join us on May 3rd to see our #eacl2023 poster and presentation on "NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages". This is our initiative to create an NLP resource for underrepresented 🇮🇩Indonesian languages. arxiv.org/abs/2205.15960

283 likes, 4 replies, 57 retweets

Alham Fikri Aji

@alhamfikri

3 years ago

🇮🇩NusaX is awarded with Outstanding Paper Award 🎉 Amazing work by all coauthors. More work to come from Indonesian NLP community, stay tuned.

241 likes, 11 replies, 31 retweets