
SysAdvent
The Systems Administration Advent Calendar provides great articles about systems administration topics written by fellow sysadmins.
3y ago
By: Ania Kapuścińska (@lambdanis)
Edited by: Shaun Mouton (@sdmouton)
Like many engineers, for a long time I’ve thought of the Linux kernel as a black box. I've been using Linux daily for many years - but my usage was mostly limited to following the installation guide, interacting with the command line interface and writing bash scripts.
Some time ago I heard about eBPF (extended BPF). The first thing I heard was that it’s a programmable interface for the Linux kernel. Wait a second. Does that mean I can now inject my code into Linux without fully understanding all the internals and compiling ...
3y ago
By: Joshua Timberman (@jtimberman)
You’re the SRE on call and, while working on a project, your phone buzzes with an alert: “Elevated 500s from API.”
You’re a software developer, and your team lead posts in Slack: “Hey, the library we use for our payment processing endpoint has a remote exploit.”
You work on the customer success team and, during a routine sync with a high-profile customer, they install the new version of your client CLI. Then, when they run any command, it exits with a non-zero return code.
An incident is any situation that disrupts the ability of customers to use a system, se ...
3y ago
By: Jessica DeVita (@ubergeekgirl)
Edited by: Jennifer Davis (@sigje)
Deployment Decision-Making During the Holidays Amid the COVID-19 Pandemic
A sneak peek into my forthcoming MSc thesis in Human Factors and Systems Safety, Lund University.
Web services that millions of us depend on for work and entertainment require vast compute resources (servers, nodes, networking) and interdependent software services, each configured in specialized ways. The experts who work on these distributed systems are under enormous pressure to deploy new features and keep the services running, so deployment decisi ...
3y ago
By: Julie Gunderson (@Julie_Gund)
Edited by: Kerim Satirli (@ksatirli)
Intro
I recently left my role as a DevOps Advocate at PagerDuty to join the Gremlin team as a Sr. Reliability Advocate. The past few months have been an immersive experience into the world of Chaos Engineering and all things reliability. That said, my foray into the world of Chaos Engineering started long before joining the Gremlin team.
From my time as a lab researcher, to being a single parent, to dealing with cancer, I have learned that the journey of unpredictability is everywhere. I could never have imagined in college ...
3y ago
By: Elias Voelker (@Elijah2807) and Faye Tandog (@fayetandog)
Edited by: Jennifer Davis (@sigje)
Good IT monitoring stands or falls with its precision. Monitoring must inform you at the right time when something is wrong. But, as in statistics, you also have to deal with the errors your monitoring produces. In this post, I will talk about two types of errors: false positives and false negatives. And, again as in statistics, you can’t eliminate these errors completely in your monitoring. The best you can do is manage them and optimize for an acceptable level of errors.
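To make the two error types concrete, here is a minimal Python sketch, not taken from the article. It assumes each monitoring check can be summarized by two booleans: whether an alert fired and whether there was a real problem. The `Check` class, the `error_counts` function, and the sample history are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One monitoring evaluation (hypothetical structure for illustration)."""
    alert_fired: bool   # did monitoring notify someone?
    real_problem: bool  # was something actually wrong?


def error_counts(checks):
    """Count both error types in a list of Check records."""
    # False positive: an alert fired although nothing was wrong.
    false_positives = sum(
        1 for c in checks if c.alert_fired and not c.real_problem
    )
    # False negative: something was wrong but no alert fired.
    false_negatives = sum(
        1 for c in checks if not c.alert_fired and c.real_problem
    )
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


# Made-up sample history, purely illustrative.
history = [
    Check(alert_fired=True, real_problem=True),    # correct alert
    Check(alert_fired=True, real_problem=False),   # false positive
    Check(alert_fired=False, real_problem=True),   # false negative
    Check(alert_fired=False, real_problem=False),  # correct silence
    Check(alert_fired=True, real_problem=False),   # false positive
]

counts = error_counts(history)
print(counts)
```

Tracking both counts over time is one way to see whether a threshold change merely trades one kind of error for the other.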
In this article ...
3y ago
By: Tyler Auerbeck (@tylerauerbeck)
Edited by: Ben Cotton (@funnelfiasco)
Thank you everyone for joining us today. We gather here to say our goodbyes to our dear friend, Localhost. They’ve been there for us through the good times, the bad times, and the “we should really be sleeping right now…but let me just try one last thing” times. They’ve held our overly-complicated terminal configurations and—in all likelihood—most of our secrets. But alas, it is time to let our good friend ride into the sunset.
Saying Goodbye
But why?! We’ve all likely spent more time than we care to admit making these m ...
3y ago
By: Joe Block (@curiousbiped)
Edited by: Jennifer Davis (@sigje)
Background
Compute, even at home with consumer-grade hardware, has gotten ridiculously cheap. You can get a quad-core ARM machine with 4 GB of RAM, such as a Raspberry Pi 4, for under $150, including a power supply and an SD card for booting. It'll idle at less than 5 watts of power draw and be completely silent because it is fanless.
What we're going to do
In this post, I'll show you how to set up a Kubernetes cluster on a cheap ARM board (or an x86 box if you prefer) using k3s and k3sup so you can learn Kubernetes without breaking an enviro ...
3y ago
By: Mandi Walls (@lnxchk)
Edited by: Joe Block (@curiousbiped)
Keeping track of all the data generated by a distributed ecosystem is a daunting task. When something goes wrong, or a service isn’t behaving properly, tracking down the culprit and getting the right folks enabled to fix it is also challenging. PagerDuty can help you with these challenges.
The PagerDuty platform integrates with over 600 other components to gather data, add context, and process automation. Under the hood of all of these integrations is the PagerDuty API, ready to help you programmatically interact with your PagerDut ...
3y ago
By: Daniel Medina
Edited by: James Turnbull (@kartar)
A colleague supporting our recruitment efforts asked hiring managers if their "job descriptions are still partying like it's 1999?" The point was to revisit old postings that had been copied and pasted down the years and create something that would increase engagement with candidates. But reading the title made me think about a job I applied for (and got) circa 1999. It was a systems administrator role and included language like:
The associate must regularly lift and/or move 20-35 pounds and occasionally lift or pull 35-80 pounds.
No joke, tho ...
3y ago
By: Amar Sattaur
Edited by: Jennifer Davis (@sigje)
Recently, I've been thinking a lot about how to implement the principle of least privilege while also speeding up the feedback cycle in the developer workflow. However, these two goals do not fit together easily. Therefore, there needs to be underlying tooling and visibility to show developers the data they need for a successful PR merge.
A developer doesn't care about what those underlying tools are; they just want access to a system where they can:
See the logs of the app that they're making a change for and the other relevant apps
S ...