Kris Cao (@kroscoo)'s Twitter Profile
Kris Cao

@kroscoo

When lava pours out near the sea's surface, tremendous volcanic explosions sometimes occur | pretraining @cohere

ID: 732544572723695616

Link: http://kriscao.github.io · Joined: 17-05-2016 12:13:30

1.1K Tweets

1.1K Followers

704 Following

Kris Cao (@kroscoo)'s Twitter Profile Photo

Felix is directly responsible for me getting into NLP by responding to a cold email from a clueless maths grad and giving me a summer project. RIP Felix. Every subsequent opportunity I’ve had has been downstream of that one small act.

Command A(idan) (@aidangomez)'s Twitter Profile Photo

I’m so excited to share something we’ve been working on for a while: North is cohere’s AI workspace for enterprises. Today we’re releasing the platform for early access!

Kris Cao (@kroscoo)'s Twitter Profile Photo

As someone who briefly touched the transcendent (applied for PhDs in axiomatic set theory) this resonates strongly with me. I think that ‘genius’ is as much the effort of self-cultivation as it is birth, and that the route to mastery in any subject is surprisingly similar.

Ollie (@ollie575563753)'s Twitter Profile Photo

cohere is growing - we're hiring MLEs to build North in our London, New York and Toronto offices. We also support remote working, 'cos RTO mandates aren't our thing. Job spec is below in the 🧵, with more info on North available here: cohere.com/north

Robert Yang (@roberty970316)'s Twitter Profile Photo

Excited to share our new paper: "Rope to Nope and Back Again: A New Hybrid Attention Strategy", where we propose a novel architecture that outperforms RoPE-NTK-based approaches with full attention span. (1/8)
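
The paper thread isn't reproduced here, but the core idea is an attention stack that mixes RoPE layers with NoPE layers (no positional encoding at all). Below is a minimal PyTorch sketch of such an interleaving; the single-head attention, dimensions, and the every-fourth-layer-is-NoPE pattern are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

def rope(x, base=10000.0):
    # Rotary position embedding over the last dim of x: (batch, seq, dim).
    _, seq, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

class CausalAttention(nn.Module):
    def __init__(self, dim, use_rope):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.use_rope = use_rope

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        if self.use_rope:            # RoPE layer: rotate queries and keys
            q, k = rope(q), rope(k)  # NoPE layers skip the rotation entirely
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        mask = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))
        return self.out(scores.softmax(dim=-1) @ v)

# Illustrative hybrid stack: every fourth layer is NoPE, the rest use RoPE.
layers = nn.ModuleList(
    CausalAttention(dim=64, use_rope=(i % 4 != 3)) for i in range(8)
)
```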

Kris Cao (@kroscoo)'s Twitter Profile Photo

once again the function-space view of neural networks leads to actionable insights. Gaussian processes should (once again) be required knowledge for ML.
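
For anyone wanting the refresher the tweet gestures at: Gaussian-process regression conditions a prior over functions directly on observed data. A minimal NumPy sketch of the standard posterior equations (textbook material, nothing specific to the tweet's context):

```python
import numpy as np

def rbf(xa, xb, length=1.0):
    # Squared-exponential kernel: prior covariance between function values.
    return np.exp(-0.5 * (xa[:, None] - xb[None, :]) ** 2 / length ** 2)

x_train = np.array([-2.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.linspace(-3.0, 3.0, 50)

K = rbf(x_train, x_train) + 1e-6 * np.eye(len(x_train))  # jitter for stability
K_s = rbf(x_train, x_test)
K_ss = rbf(x_test, x_test)

mean = K_s.T @ np.linalg.solve(K, y_train)            # posterior mean
cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)          # posterior covariance
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))       # pointwise uncertainty
```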

Acyr Locatelli (@acyr_l)'s Twitter Profile Photo

I'm hiring performance engineers for the pre-training team at Cohere. If you enjoy writing efficient kernels and working on hardware-aligned architecture design and optimisation, do reach out! Check out the live job posting here: jobs.ashbyhq.com/cohere/d42f5fd…

Michael Hu (@michahu8)'s Twitter Profile Photo

Training on a little 🤏 formal language BEFORE natural language can make pretraining more efficient! How and why does this work? The answer lies… Between Circuits and Chomsky. 🧵1/6👇

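The thread's details aren't reproduced here; as a toy illustration of the setup, the sketch below builds a two-phase data curriculum that warms up on a synthetic formal language before switching to natural text. Dyck-style balanced brackets are a common stand-in with nested dependencies; the paper's actual formal languages and mixing schedule may differ.

```python
import random

def dyck_sample(max_depth=8, p_open=0.6, p_stop=0.2):
    # Sample a balanced-bracket (Dyck) string: a simple formal language
    # whose nested dependencies make it plausible scaffolding for syntax.
    out, depth = [], 0
    while True:
        if depth < max_depth and (depth == 0 or random.random() < p_open):
            out.append("(")
            depth += 1
        else:
            out.append(")")
            depth -= 1
        if depth == 0 and random.random() < p_stop:
            return "".join(out)

def curriculum(formal_steps, natural_corpus):
    # Phase 1: formal-language warmup. Phase 2: ordinary natural-language text.
    for _ in range(formal_steps):
        yield dyck_sample()
    yield from natural_corpus

for doc in curriculum(3, ["the cat sat on the mat"]):
    print(doc)
```
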
Cohere Labs (@cohere_labs)'s Twitter Profile Photo

Introducing ✨ Aya Vision ✨ - an open-weights model to connect our world through language and vision. Aya Vision adds breakthrough multimodal capabilities to our state-of-the-art multilingual 8B and 32B models. 🌿

Kris Cao (@kroscoo)'s Twitter Profile Photo

we have a new model, it's pretty good and we like it; we think you'll like it too. (as an aside, this is the first model i contributed to at cohere!)

Kyle Duffy (@kyduffy)'s Twitter Profile Photo

My team recently launched a best-in-class LLM specializing in English and Arabic. We just published a tech report explaining our methods. Check it out on arXiv: arxiv.org/abs/2503.14603

Max Bartolo (@max_nlp)'s Twitter Profile Photo

I'm excited to share the tech report for our @Cohere @CohereForAI Command A and Command R7B models. We highlight our novel approach to model training, including the use of self-refinement algorithms and model merging techniques at scale. Command A is an efficient, agent-optimised…

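"Model merging at scale" is worth unpacking: in its simplest form, merging is just weighted parameter averaging across checkpoints. A minimal PyTorch sketch of that basic idea (the report's actual recipe is more involved; this shows only the simplest variant):

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    # Weighted parameter average across model checkpoints: the simplest
    # form of model merging (assumes identical architectures and keys).
    n = len(state_dicts)
    weights = weights if weights is not None else [1.0 / n] * n
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage: average two fine-tuned variants of the same base model.
# merged = merge_state_dicts([model_a.state_dict(), model_b.state_dict()])
# model.load_state_dict(merged)
```
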
omer goldman (@omernlp)'s Twitter Profile Photo

Wanna check how well a model can share knowledge between languages? Of course you do! 🤩 But can you do it without access to the model’s weights? Now you can with ECLeKTic 🤯

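The appeal of a weights-free benchmark is that it only needs generated answers. Below is a hypothetical sketch of a black-box cross-lingual probe in this spirit; `ask` is a stand-in for any chat-API client, and the scoring is deliberately simplified relative to the real benchmark's protocol.

```python
def ask(model: str, prompt: str) -> str:
    # Stand-in for an API call to a model whose weights are inaccessible.
    raise NotImplementedError("plug in your chat-completions client here")

def transfer_score(model, items):
    # items: (question_in_source_lang, question_in_target_lang, answer) triples,
    # where the underlying fact appears only in source-language training data.
    transferred = 0
    for src_q, tgt_q, answer in items:
        knows_src = answer.lower() in ask(model, src_q).lower()
        knows_tgt = answer.lower() in ask(model, tgt_q).lower()
        # Knowledge "transferred" if it is retrievable in both languages.
        transferred += int(knows_src and knows_tgt)
    return transferred / len(items)
```
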
Cohere Labs (@cohere_labs)'s Twitter Profile Photo

How can we make language models more flexible to adapt to new languages after pretraining? 🌏 🧠 Our latest work investigates whether a tokenizer trained on more languages than the pretraining target can improve language plasticity without compromising pretraining performance.

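The mechanism under study: train the tokenizer on a superset of the pretraining languages, so that later language adaptation doesn't fight the vocabulary. A minimal sketch with the Hugging Face tokenizers library; the file list, vocab size, and special tokens are illustrative, not the paper's actual recipe.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# BPE tokenizer trained on a corpus covering MORE languages than the
# pretraining mix, leaving vocabulary room for languages added later.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(
    vocab_size=32000,
    special_tokens=["[UNK]", "<s>", "</s>"],
)
# Illustrative file list: broader language coverage than the training data.
tokenizer.train(
    files=["en.txt", "fr.txt", "sw.txt", "hi.txt", "ja.txt"],
    trainer=trainer,
)
tokenizer.save("universal_tokenizer.json")
```
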
Diana Abagyan (@dianaabagyan)'s Twitter Profile Photo

🚨 New pretraining paper on multilingual tokenizers 🚨 Super excited to share my work with Cohere Labs: "One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers"
