The why
It’s been almost a year since I moved from a dedicated Software Engineering role to a Site Reliability Engineering role, which still has a large Software Engineering component. The primary reason for this shift was that I had become bored with the monotony of maintaining monolithic software. I have been working in the Software Development and Software Engineering field for the last 10 years. I have experienced both service-oriented and product-oriented Software Development in my career. Sooner or later I always end up in a position where I largely maintain a handful of codebases and watch them grow as I onboard more people onto them. And I realized I do not enjoy the process anymore.
As my skills and knowledge grow in a certain product, it’s almost inevitable that I become limited to only certain codebases. Sure, I can grow into other areas. But then I have to spend dedicated time in a new codebase to make part of it my own, which becomes the same cycle. I prefer to explore the parts that interest or concern me and then move on to explore more. Almost nothing is worth spending the time to learn the code by heart. Codebases are always changing, and frameworks are frequently found too limited for new implementation requirements. The real value of knowledge lies in knowing the nature of the products and the market that we operate in.
Over the years I found that I do not actually like to go too deep into the abyss that is a codebase. I would not find anything special in there. Honestly, what would it be that would really excite us? An ingenious for loop? A meticulously crafted recursive routine? A pattern that you could read on a page of the GoF book? Sure, some people might find doing that sort of stuff fun, but no realistic Software Development & Engineering job can provide that thrill on a daily basis. You always have deadlines to meet, and refactoring adventures aren’t something any team can constantly make time for. And in the rare cases when you do get the time, you might find yourself in a greenfield project, where everyone is like a kid with a white wall and crayons. Everybody ends up turning their opinions into implementation, and the codebase as a whole ends up a gigantic over-engineered bowl of noodles. And then comes the tax.
Why the boredom with SDE?
The main source of boredom with software engineering for me is the business logic. There is really nothing in SDE except more and more business logic when you are the one coding it away. Day in, day out, it’s always the same. You don’t really find the need for a new algorithm or a whole new design pattern never before seen by humanity. The element of discovery is very limited and often nonexistent. You break the customer problem down and you find abstractions, REST APIs, frontends and backends. And in those components you find loops, functions, objects and events. They get turned into tickets and you get assigned to them.
The little excitements can mostly be found in using a new tool that forces a new pattern that causes a whole lot of refactoring. If you survive the onslaught of fresh new bugs to resolve, it’s a nice few weeks of adventure, and then we get back to the grind.
Yes, there might be interesting and edgy moments for Solution Architects and Product Owners. After all, they are the ones who actually turn human thoughts into machine-compatible chunks of logic. But I am still quite far from getting there.
So in essence, SDE means reading problems, breaking them into chunks that can ultimately be written using known patterns, following compliance and ensuring compatibility with the rest of the stack.
Realizing this inevitability, I took my first ticket out of SDE and into SRE land. Because SRE sounded like it was offering more than SDE while keeping the good parts.
Why SRE? Why not Management or BA or even non tech?
I still love computation and coding. I still enjoy building and planning. I still do not feel comfortable managing people and telling them what to do. I prefer to share ideas and have discussions that lead to better planning. I still love engineering. SRE has all that and more.
SREs do more than just maintain availability. They continuously work towards improving availability. Most SRE roles contain a percentage of Software Engineering work as well. In this component, SREs build automations to tackle the ever growing complexity of maintaining availability of software that keeps growing.
Most of what I do as part of the Software Engineering component is introduce more automation to tackle what is called toil. I don’t introduce new buttons; I introduce what a button might use behind the scenes. If it is feasible to let a customer scale their cluster with a few inputs, we build the automation that those inputs are fed into. If there is a process a customer should follow to achieve certain infrastructure goals by themselves, that’s what I work on and put into documentation.
As an SRE, I also do on-call shifts just like every other service-oriented team. And the coverage is not always limited to products or services that my team owns. We look after a wider range of services and contact the owners when necessary.
Another part that we play is to enable other SREs through internal tools and services. We own that work and develop the features in those services as well. Finding the root cause of issues is a common task we handle. And the root cause does not always originate in the work that we own.
One of the good things that we do as SREs is that we actually put problems into documents and discuss approaches. We handle unknowns on a daily basis, so opinions are considered with an open mind. We continually work towards improving life for the people who are on-call. We have to communicate a lot with the people who handle customer requests. Communication is the driving force. All this provides great learning opportunities for someone like me. In SDE, I did not have such wide interaction requirements, so the learning was very product-centric. Whereas now, I am frequently out of my comfort zone.
In summary, we do a little bit of everything and know a little bit about everything. Sometimes we even attempt to read code we probably will never own in order to find and report issues. Software Engineering and Operations are both important parts of our role. And we get to play with a lot of infrastructure, network and system problems. Perfect.
Is SRE land all blue skies, green fields and flowers?
No. It is definitely more stressful than what I had in SDE, especially in products. But I think it’s better if I use my growth phase now rather than later. At some point, we all grow into the role and it becomes easy. But life does not allow us to live outside the comfort zone all the time. At some point I will lose my ability to take on the stress of a new role. So I think it’s better if I do it now and settle in. And it’s not like I can never go back into SDE. After all, I still write business logic, unit tests and end-to-end tests. I still go through release cycles. I still write documentation.