LDC (@ldcupenn)'s Twitter Profile
LDC

@ldcupenn

LDC creates and distributes language resources to universities, labs, companies and libraries for linguistic education, research and technology development.

ID: 2874341584

Link: http://ldc.upenn.edu

Joined: 12-11-2014 21:46:23

750 Tweets

387 Followers

0 Following

MATERIAL Farsi-English Language Pack has 61 hours of Farsi conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross-language information retrieval bit.ly/3OMUNdJ

Abstract Meaning Representation 3.0 - Machine Translations: AMR 3.0 training, development and test splits translated into Spanish, Irish Gaelic, and Dutch using Google Translate, developed at KU Leuven bit.ly/4g0yf5c

There is only one week left to apply for LDC’s Spring Data Scholarship Program! Successful student candidates receive no-cost access to LDC data for their research. Submit your application by January 15. For program requirements, visit: ldc.upenn.edu/language-resou…

Happy 2025 to all! Jump into the new year with LDC, getting the latest on 2025 membership discounts and two new publications, Iraqi Arabic – English Lexical Database and LORELEI Hungarian Representative Language Pack ldc-upenn.blogspot.com

LDC’s Iraqi Arabic - English Lexical Database has over 67,000 Iraqi Arabic words in Arabic script and IPA notation and more than 120,000 English tokens, developed in collaboration with Georgetown University Press to update and enhance 1960s dictionaries bit.ly/4fTrlxU

More LDC data in the LORELEI series: LORELEI Hungarian Representative Language Pack features monolingual and parallel text, annotations, software tools and more for human language technology development to address emergent situations bit.ly/3Pv79I0

LDC’s February newsletter features LDC at LT4All 2025, data scholarship winners, reminders on 2025 membership discounts and two new publications, AIDA Scenario 3 Practice Topic Source Data and Annotation and MATERIAL Georgian-English Language Pack ldc-upenn.blogspot.com

AIDA Scenario 3 Practice Topic Source Data and Annotation: 1417 English, Russian and Spanish web documents related to the COVID-19 pandemic with annotations for relations, events, entities and claim frames, developed by LDC bit.ly/40Xfbym

MATERIAL Georgian-English Language Pack has 79 hours of Georgian conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross-language information retrieval bit.ly/4aXbNIt

Happy International #MotherLanguageDay! This year’s theme celebrates 25 years of efforts to preserve linguistic diversity and promote education in mother tongues. LDC is committed to making resources available for all languages of the world. un.org/en/observances…

LDC is pleased to be a sponsor of #LT4All (Feb 24-26), organized by ELRA and SIGUL in partnership with UNESCO as part of the International Decade of #IndigenousLanguages. This year’s theme is “Advancing Humanism through Language Technologies”. #LanguageTechForAll

Check out LDC’s March newsletter for details on two new publications, 2015 NIST LRE Evaluation Test Set and The Xi’an Multi-Language Learner Corpus ldc-upenn.blogspot.com

2015 NIST LRE Evaluation Test Set has 867 hours of telephone speech and broadcast narrowband speech in 20 languages representing 6 clusters of related languages: Arabic, Spanish, English, Chinese, Slavic, and French bit.ly/3Fmq068

The Xi’an Multi-Language Learner Corpus: 526 argumentative essays in 15 languages by Chinese L1 university undergraduate students studying second languages, collected in 2023 and 2024; developed by Xi’an International Studies University bit.ly/4hqKxDG

The April newsletter introduces LDC’s upgraded website, welcomes Bluesky to our social media channels and has the latest on LDC’s two new publications, DEFT Spanish Light and Rich ERE Annotation and MATERIAL Kazakh-English Language Pack ldc-upenn.blogspot.com

DEFT Spanish Light and Rich ERE Annotation: 158 Latin American discussion forum and Spanish newswire documents annotated for entities, relations and events, including coreference (light) and event hoppers (rich), developed by LDC for the DARPA DEFT program bit.ly/3YcGCnd

MATERIAL Kazakh-English Language Pack has 57 hours of Kazakh conversational telephone speech, transcripts, English translations, annotations and queries designed to support cross-language information retrieval bit.ly/42cwe01

Check out LDC’s May newsletter for two new companion releases developed by LDC to support the DARPA BOLT program, BOLT CTS CALLFRIEND CALLHOME Mandarin Chinese Audio and BOLT CTS CALLFRIEND CALLHOME Mandarin Chinese Transcripts and Translations ldc-upenn.blogspot.com

BOLT CTS CALLFRIEND CALLHOME Mandarin Chinese Audio: 93 hours of telephone speech from 236 conversations between native speakers; developed by LDC for the DARPA BOLT program; contains previously unexposed calls from the CF/CH collections bit.ly/4kbsBPy

BOLT CTS CALLFRIEND CALLHOME Mandarin Chinese Transcripts and Translations: transcripts and English translations for 93 hours of BOLT CTS telephone recordings; all speech was transcribed; 89% of the transcripts were translated bit.ly/4jKul2j