Shipping Cloud-Native Software With a Focus on Sustainable Operability
4 August, 2021

Senior Engineering Manager at Splunk
Problem
I have run multiple service teams across different parts of the company for a few years now, and it always comes down to the cost of operating services. You quickly realize that the job is not only about developing the software’s features but also about operating it sustainably for years to come. Teams can lean towards near-term benefits at the expense of long-term sustainable operability. And as a company grows and teams come to own multiple services, those services naturally end up being very different from one another.
Services of different vintages can rely on very different technologies and frameworks. The differences in coding frameworks, testing frameworks, ops dashboards, and troubleshooting tools add up over time, and scaling problems become very evident as a result. Our teams also had to reimplement some features across multiple services, and numerous code silos formed: pockets of code that only a few engineers were experts in, while nobody else knew how they really worked.
Actions taken
The first solution for me was to double down and invest in the nonfunctional engineering requirements that power sustainable operability. I started treating the operation of services as a top-level line item that every team should be responsible for. The teams were no longer accountable only for developing and shipping features the way product and engineering managers had requested. They were also responsible for operating their own services and the infrastructure behind them.
I encouraged my team to hold strong opinions on their tech stacks and coached an approach that favors operating, scaling, and being accountable for services in addition to shipping exciting features. Everyone loves to ship exciting features, but that is not all we have to do as engineers. Hence, if my team felt strongly that a different tool or technology would work better, they were free to choose it as long as they could back the choice with sound reasoning.
Lastly, I fostered a culture of sustainably operating back-end services, which pays rich dividends. There were a bunch of cool technologies hiding behind our products, and we wanted to keep innovating and shaping the future, but we wanted everything to be run in a sustainable way.
Lessons learned
- Be thoughtful about how peripheral systems are designed and architected. Not all of them deliver direct value to users, so be mindful of how much you invest in each.
- Hire and recognize people who treat build systems, testing frameworks, SRE best practices, and debugging and troubleshooting tools with the same enthusiasm as feature work. Everyone is excited about shipping user-facing features, but the rest is just as important.
- Grow the collective strength of the team in the direction that benefits it over the long term.
- Your organization’s culture decides whether stakeholder management and the positioning of non-functional engineering work will be easy or challenging. Be flexible in how you approach both.