Yossi Keshet (@jkeshet) Twitter Tweets • TwiCopy

Yossi Keshet

3 years ago

Can we build a system that gives almost perfect auditory and visual feedback to learners of a new language? Read my Medium post on a new algorithm that generates synthetic feedback of the proper pronunciation from the incorrect one, in the speaker’s own voice. link.medium.com/ALdBTFfxisb

Likes: 9 | Replies: 0 | Retweets: 1

Amazon Science

@amazonscience

3 years ago

Twenty years ago, Yossi Keshet, Amazon Scholar and Technion Israel associate professor, was working on the problem of automatic speech recognition—but he says it still isn't a solved problem. Find out why, where he sees gaps, and what he’s eager to explore in this research field. #ASR

Likes: 18 | Replies: 0 | Retweets: 7

Yossi Keshet

@jkeshet

3 years ago

Ever wondered why Kim Kardashian sounds so cool and how it influences the accuracy of automatic speech recognizers? See the blog post of Bronya Roni Chernyak describing our joint work with Talia Ben Simon & Yael Segal, along with Eleanor Chodroff, @JeremySteffman & Jennifer Cole!

Likes: 5 | Replies: 0 | Retweets: 1

Felix Kreuk

@felixkreuk

3 years ago

We present “AudioGen: Textually Guided Audio Generation”! AudioGen is an autoregressive transformer LM that synthesizes general audio conditioned on text (Text-to-Audio). 📖 Paper: tinyurl.com/audiogen-text2… 🎵 Samples: tinyurl.com/audiogen-text2… 💻 Code & models - soon! (1/n)

Likes: 4.4K | Replies: 92 | Retweets: 917

Yossi Keshet

@jkeshet

3 years ago

Do you want to speed up or slow down the speech while listening to podcasts or YouTube? Now you can do that with exceptional quality. Read Eyal Cohen's blog presenting our work on generating speech with outstanding quality. medium.com/@eyalcohen308/…

Likes: 4 | Replies: 4 | Retweets: 0

NorthwesternLinguist

@linguisticsnu

3 years ago

"Using automatic acoustic analysis to reveal disruptions to speech articulation in individuals at risk for psychosis." K. Hitczenko, Y. Segal, Yossi Keshet, Mittal ADAPT Lab, @MattGoldrick. Poster session 4aSC, #ASA184 (5/7)

Likes: 3 | Replies: 2 | Retweets: 1

Felix Kreuk

@felixkreuk

3 years ago

We present MusicGen: A simple and controllable music generation model. MusicGen can be prompted by both text and melody. We release code (MIT) and models (CC-BY NC) for open research, reproducibility, and for the music community: github.com/facebookresear…

Likes: 1.1K | Replies: 39 | Retweets: 407

arXiv Sound

@arxivsound

2 years ago

"DiffAR: Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation," Roi Benita, Michael Elad, Joseph Keshet, ift.tt/1UPCrnc

Likes: 8 | Replies: 0 | Retweets: 3

Yossi Keshet

@jkeshet

2 years ago

Check out our latest speech synthesis work that can produce Vocal Fry - a voice register used to socially express avoidance, but also popular among celebrities & upwardly mobile women. With @RoiBenita and Michael Elad at #ICLR2024 arxiv.org/abs/2310.01381

Likes: 1 | Replies: 0 | Retweets: 0

Andrey Cheptsov

@andrey_cheptsov

a year ago

A major open-source release! aiOla drops Whisper-Medusa, a model that is 50% faster than OpenAI’s Whisper. The model is based on the new "multi-head" attention architecture. Paper: paperswithcode.com/method/multi-h… GitHub: github.com/aiola-lab/whis… HuggingFace: huggingface.co/aiola/whisper-…

Likes: 21 | Replies: 1 | Retweets: 9

Yossi Keshet

@jkeshet

a year ago

My aiOla team has just released Whisper-Medusa, which is 50% faster than OpenAI's Whisper without sacrificing accuracy. It predicts up to 10 tokens simultaneously. github.com/aiola-lab/whis… #SpeechRecognition yaelsegal Aviv Shamsian Aviv Navon @gilhetz

Likes: 3 | Replies: 0 | Retweets: 0

ECE_Technion

@ece_technion

a year ago

Exciting news! Prof. Yossi Keshet from @TechnionECE joins forces with Prof. Bhiksha Raj as Chief Scientist at aiOla, pioneering next-gen AI speech tech! Proud of our alumnus Alon Peleg, serving as aiOla's COO! 🎓 Full story: radicaldatascience.wordpress.com/2024/10/21/two…

Likes: 6 | Replies: 1 | Retweets: 1

aiOla

@_aiola

a year ago

Big News in Ethical AI! aiOla’s new open-source model automatically identifies, tags, and masks sensitive information—names, phone numbers, addresses—all in one seamless step during audio transcription. A true leap forward in privacy-first AI. huggingface.co/spaces/aiola/w…

Likes: 2 | Replies: 0 | Retweets: 2

Yossi Keshet

@jkeshet

9 months ago

We didn’t build Jargonic just to top a leaderboard—we built it to thrive in the real world: noisy, unpredictable, and full of domain-specific jargon. That’s what makes this milestone so meaningful. #ai #speech

Likes: 3 | Replies: 0 | Retweets: 0

IEEE Speech and Language Processing

@ieeesltc

5 months ago

📢🌟🌟Call for ICASSP 2026 Speech and Language Processing Reviewer Nominations Please submit new speech and language processing reviewer nominations for ICASSP 2026 using the form below. docs.google.com/forms/d/1wtydY…

Likes: 14 | Replies: 0 | Retweets: 10

aiOla

@_aiola

4 months ago

We are proud to have the best voice and speech AI lab in the world! Keep up the good work, team! 🚀 x.com/_akhaliq/statu…

Likes: 15 | Replies: 0 | Retweets: 2