Natan Yellin (@aantn)'s Twitter Profile
Natan Yellin

@aantn

Take Kubernetes monitoring to the next level with Prometheus and @RobustaDev

ID: 17523402

Link: https://robusta.dev
Joined: 20-11-2008 21:48:32

2.2K Tweets

5.5K Followers

1.1K Following

Natan Yellin (@aantn)

DevOps failed. DevOps was supposed to be "devs doing ops" - or at least something close to that. There was never supposed to be a job called "DevOps Engineer". But where did it go wrong and why did everyone need to hire DevOps engineers anyway? More on that tomorrow.

Pavan Gudiwada (@pavangudiwada_)

Using AI at your org? See how your teams can improve results and speed up your incident resolution process in just a few minutes! Plus, you can do it using an OSS project. Here's the agenda...

Natan Yellin (@aantn)

This is what AI for Prometheus alerts looks like. Below is an alert for HighLatency on an HTTP endpoint. If you want access, DM me or leave a comment.

Natan Yellin (@aantn)

What was wrong with DevOps? It assumed devs would keep on developing + take on a myriad of new responsibilities & do them as well as dedicated teams. Yes, devs should be *involved* in other areas, but it's hubris to assume the other responsibilities are trivial, not disciplines.

Natan Yellin (@aantn)

There is no such thing in Kubernetes as a "pod CPU request". Requests and limits are specified per-container - the closest thing to a "pod request" is the sum of all container requests. Now that's changing with a new KEP adding pod-level requests/limits github.com/kubernetes/enh…
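
The "sum of all container requests" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pod spec is a plain dict shaped like a Kubernetes manifest, the helper names `parse_cpu` and `summed_cpu_request` are hypothetical, and init containers and pod overhead (which the scheduler also takes into account) are ignored.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "250m" or "1.5" into cores."""
    if quantity.endswith("m"):
        # "m" means millicores: 1000m == 1 core
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def summed_cpu_request(pod_spec: dict) -> Decimal:
    """Add up the CPU requests of all regular containers in a pod spec.

    Hypothetical helper: containers without a CPU request count as 0.
    """
    total = Decimal(0)
    for container in pod_spec.get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        total += parse_cpu(requests.get("cpu", "0"))
    return total


# Example pod spec (made up for illustration)
pod_spec = {
    "containers": [
        {"name": "app", "resources": {"requests": {"cpu": "250m"}}},
        {"name": "sidecar", "resources": {"requests": {"cpu": "100m"}}},
        {"name": "no-request"},
    ]
}

print(summed_cpu_request(pod_spec))  # prints 0.35
```

Here the two containers with requests contribute 0.25 and 0.1 cores, so the summed value is 0.35 cores even though no single field in the spec says so.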

Natan Yellin (@aantn)

What if we made it EASY for devs to do "you build it, you run it" & gave them tools? But didn't ask them to maintain servers, be security experts, and so on. That's the gist of Platform Engineering. It's a healthier approach than "devs do ops". More on this in a webinar tomorrow.

Natan Yellin (@aantn)

What happens in Kubernetes if you do a rolling update but pods are stuck in terminating? Do new pods get created or are the terminating pods still considered running? There's a KEP for that! github.com/kubernetes/enh…

Natan Yellin (@aantn)

Coming soon: HolmesGPT integration for Kafka, so you can see faster why Kafka alerts fire and what is causing lag. To request beta access, DM me or leave a comment and I'll get in touch.

Natan Yellin (@aantn)

Last year as CEO I made a risky decision. There was hype around AI, but skepticism had never been higher. I chose to focus our attention on AI for PrometheusMonitoring alerts. A year later, it paid off. Here is the result and what it means for the future of on-call.

Natan Yellin (@aantn)

What's the most annoying Prometheus alert that is always a false-positive in your experience? (Especially from defaults in kube-prometheus-stack.) I'll go first - TargetDown is almost always PrometheusTargetMisconfigured.

Natan Yellin (@aantn)

Feedback wanted: which AI investigation of a Prometheus alert 👇 do you find most useful? We're playing with different output formats for HolmesGPT.

Jorge Arteiro (@jorgearteiro)

Are you at KubeCon Salt Lake City? Watch the new video I recorded with Natan Yellin from Robusta.dev: Using Azure AI and HolmesGPT as an AI Assistant for AKS Alerts. The Robusta team is at KubeCon! They are close to the CNCF store. #holmesGPT AzureTar youtu.be/hZhcD1D-Bgo?si…

Andrea (@alacolombiadev)

Fascinating chat with Natan Yellin on the future of observability: 🤖 How BYOLLM is the way forward 🚀 AI supercharging incident response 📈 HolmesGPT's impact on MTTR All open source, all epic!