SRE WEEKLY on Feedspot

SRE Weekly Issue #434

SRE WEEKLY

by lex

4d ago

A message from our sponsor, FireHydrant: We’ve gone all out on our new integration with Microsoft Teams. If you’re a MS Teams user, FireHydrant now supports the most comprehensive integration for incident management. Run the entire IM process without ever leaving the chat. https://firehydrant.com/blog/introducing-a-brand-new-microsoft-teams-integration/

Technical Details: Falcon Update for Windows Hosts

The big news this week, of course, is the CrowdStrike-related series of outages in airports, banks, and many other businesses. Here’s their statement on the situation ...

SRE Weekly Issue #433

by lex

1w ago

5 Non-Technical Skills Every Site Reliability Engineer Should Master

This article covers five skills: Ability to Lead Taking Charge in Critical Situations Expressing Opinions in a Non-Conflicting Way Leading Initiati ...

SRE Weekly Issue #432

by lex

2w ago

Investigating Mysterious Kafka Broker I/O When Using Confluent Tiered Storage

In this debugging story, an engineer wielded SystemTap to figure out why a Kafka broker was doing a ridiculous amount of reads. ...

SRE Weekly Issue #431

by lex

3w ago

Cloudflare incident on June 20, 2024

This is a really thorny one. As individual subprocesses started infinitely looping, their system shifted load to other datacenters, masking the problem. A coinciding failure in the ...

SRE Weekly Issue #430

by lex

1M ago

r/sre: Senior SRE looking for a resume review, out of work for 7+ months now and still struggling to get interviews

Lots of great tips in the comments if you’re looking to tune your resume. u/goodolbluey an ...

SRE Weekly Issue #429

by lex

1M ago

Virtualizing Our Storage Engine

Time to get down into the bits and bytes of how Honeycomb queries work with this look into a recent optimization in their data storage layer.

Hazel Edmands — Honeycomb ...

SRE Weekly Issue #428

by lex

1M ago

The Reverse Red Herring

This article presents an incident theme that I’ve lived through many times but never had such a pithy name for.

Geoff Townsend — Blameless

Centralisation and distribution: When on ...

SRE Weekly Issue #426

by lex

2M ago

Got any burning questions to ask an experienced SRE? I’m gathering your questions in this google form, and I’d love to hear from you. I’m hoping to use your questions to help inspire authors looking to write more great SRE-related content.

A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https://firehydrant.com/blog/ai-for-incident-management-is-here/

The Rule of 5 Errors

If your overall request volume is low, s ...

SRE Weekly Issue #424

by lex

2M ago

My Availability Investment Playbook

Here’s an ultra-practical guide to pushing for reliability investments at your company, formatted as a runbook with a set of specific steps.

Ross Brodbeck

MemoryDB: Speed, Durability, and Composition.

A neat dive into how Amazon’s MemoryDB composes ...

SRE Weekly Issue #423

by lex

2M ago

How to Fight Alert Fatigue with Synthetic Monitoring

This one’s full of great advice about making sure alerts are actionable, including alerting on flows that actually matter to customers.

Nočnica Mellifera — Checkly

What playing Magic: the Gathering taught me about incidents.

Here are a collection of thi ...
