SRE Weekly Issue #434
SRE WEEKLY
by lex
4d ago
A message from our sponsor, FireHydrant: We’ve gone all out on our new integration with Microsoft Teams. If you’re a MS Teams user, FireHydrant now supports the most comprehensive integration for incident management. Run the entire IM process without ever leaving the chat.
https://firehydrant.com/blog/introducing-a-brand-new-microsoft-teams-integration/
Technical Details: Falcon Update for Windows Hosts
The big news this week, of course, is the CrowdStrike-related series of outages in airports, banks, and many other businesses. Here’s their statement on the situation ...
SRE Weekly Issue #433
SRE WEEKLY
by lex
1w ago
5 Non-Technical Skills Every Site Reliability Engineer Should Master
This article covers five skills: Ability to Lead Taking Charge in Critical Situations Expressing Opinions in a Non-Conflicting Way Leading Initiati ...
SRE Weekly Issue #432
SRE WEEKLY
by lex
2w ago
Investigating Mysterious Kafka Broker I/O When Using Confluent Tiered Storage
In this debugging story, an engineer wielded SystemTap to figure out why a Kafka broker was doing a ridiculous amount of reads. ...
SRE Weekly Issue #431
SRE WEEKLY
by lex
3w ago
Cloudflare incident on June 20, 2024
This is a really thorny one. As individual subprocesses started infinitely looping, their system shifted load to other datacenters, masking the problem. A coinciding failure in the ...
SRE Weekly Issue #430
SRE WEEKLY
by lex
1M ago
r/sre: Senior SRE looking for a resume review, out of work for 7+ months now and still struggling to get interviews
Lots of great tips in the comments if you’re looking to tune your resume.
u/goodolbluey an ...
SRE Weekly Issue #429
SRE WEEKLY
by lex
1M ago
Virtualizing Our Storage Engine
Time to get down into the bits and bytes of how Honeycomb queries work with this look into a recent optimization in their data storage layer.
Hazel Edmands — Honeycomb ...
SRE Weekly Issue #428
SRE WEEKLY
by lex
1M ago
The Reverse Red Herring
This article presents an incident theme that I’ve lived through many times but never had such a pithy name for.
Geoff Townsend — Blameless
Centralisation and distribution: When on ...
SRE Weekly Issue #426
SRE WEEKLY
by lex
2M ago
Got any burning questions to ask an experienced SRE? I’m gathering your questions in this Google Form, and I’d love to hear from you. I’m hoping to use your questions to help inspire authors looking to write more great SRE-related content.
A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates.
https://firehydrant.com/blog/ai-for-incident-management-is-here/
The Rule of 5 Errors
If your overall request volume is low, s ...
SRE Weekly Issue #424
SRE WEEKLY
by lex
2M ago
My Availability Investment Playbook
Here’s an ultra-practical guide to pushing for reliability investments at your company, formatted as a runbook with a set of specific steps.
Ross Brodbeck
MemoryDB: Speed, Durability, and Composition.
A neat dive into how Amazon’s MemoryDB composes ...
SRE Weekly Issue #423
SRE WEEKLY
by lex
2M ago
How to Fight Alert Fatigue with Synthetic Monitoring
This one’s full of great advice about making sure alerts are actionable, including alerting on flows that actually matter to customers.
Nočnica Mellifera — Checkly
What playing Magic: the Gathering taught me about incidents.
Here is a collection of thi ...
