Shipping Cloud-Native Software With a Focus on Sustainable Operability
4 August, 2021

Nikhil Mungel
Senior Engineering Manager at Splunk
Problem
I have run multiple service teams across different parts of the company for a few years now, and it always comes down to the cost of operating services. You realize very quickly that the job is not only about developing features but also about operating the software sustainably for years to come. Teams can occasionally favor near-term benefits over long-term sustainable operability. Naturally, as a team in a growing company comes to own multiple services, those services end up being very different from one another.
Services of different vintages can rely on very different technologies and frameworks. The differences in coding frameworks, testing frameworks, ops dashboards, and troubleshooting tools add up over time, and scaling problems become very evident as a result. Teams also had to reimplement some features across multiple services, and numerous code silos formed: pockets of code that only a few engineers were experts in, while nobody else knew how they really worked.
Actions taken
The first solution for me was to double down and invest in the nonfunctional engineering requirements that power sustainable operability. I started treating the operation of services as a top-level line item that every team should be responsible for. The teams were no longer accountable only for developing and shipping the features that product and engineering managers had requested. They were also responsible for operating their own services and infrastructure.
I encouraged my team to hold strong opinions on their tech stacks and coached an approach that favors operating, scaling, and being accountable for services in addition to shipping exciting features. Everyone loves to ship exciting features, but that is not all we have to do as engineers. Hence, if my team felt strongly that a different tool or technology would work better, they were free to choose it as long as they could back the choice with sound reasoning.
Lastly, I fostered a culture of sustainably operating back-end services, which pays rich dividends. There were a bunch of cool technologies hiding behind our products, and we wanted to innovate and shape the future, but we also wanted everything to run in a sustainable way.
Lessons learned
- Be thoughtful about how peripheral systems are designed and architected. Not all of them deliver primary value to users, so be mindful of how much you invest in each.
- Hire and recognize people who treat build systems, testing frameworks, SRE best practices, and debugging and troubleshooting tools with the same enthusiasm. Everyone is excited about shipping user-facing features, but the rest is just as important.
- Grow the collective strength of the team in the direction that benefits it over the long term.
- Your organization’s culture decides whether stakeholder management and positioning non-functional engineering work are going to be easy or challenging. Be flexible in how you approach both.