Managing Tech Debt: A Success Story

Beier Cai

Co-founder and CTO at Commit

Problem

In the early days, we iterated fast and accumulated significant technical debt. Some of it was intentional, some certainly not. As I transitioned to managing multiple teams, I realized how much tech debt slowed us down and impeded the team members who were fighting it. By that point, I had watched a single piece of code sprawl into thousands of lines, which perplexed engineers and kept them from changing any functionality. Most engineers were afraid to go anywhere near these pieces. Tech debt also affected our ability to scale. The system was breaking all the time. People would be paged in the middle of the night because traffic in Europe or Japan was peaking and we couldn’t handle the load.

Most importantly, tech debt was hammering the team and hurting their ability to ship features. We didn’t have a proper mechanism to deal with tech debt, so people working on features would spend a couple of weeks on tasks that should have taken days. Finally, because of our inaction in managing tech debt, we noticed that people had started to fix it ad hoc, with almost no visibility, rather than focusing on the work they were assigned.

Actions taken

In essence, I knew we needed an overall tech debt strategy that would make things more transparent. We had to surface the problem of tech debt; it couldn’t stay hidden anymore. Then it had to be properly prioritized, as we couldn’t fix everything at once. Furthermore, we had to create an environment in which the team could address tech debt proactively and continuously. Enough time had to be allocated to manage tech debt, and measuring success had to be a priority from the start. In the end, we should be able to tell stakeholders how much debt had been resolved or how much velocity had increased. We broke this process down into smaller steps and took one step at a time.

First and foremost, I initiated a series of conversations with stakeholders to convince them that tech debt is real. We had to face it or bear the consequences of denial. I had to ensure that all stakeholders were willing to acknowledge the problem and understood the need to solve it. I believed we should plan features and tech debt together from the start, since involving all stakeholders would ensure full visibility and alignment.

Then I dug into the extensive literature to understand how other people had addressed the problem. I came across Steve Garnett’s blog and his Strategies for Technical Debt Prioritization, which prioritizes tech debt based on several criteria. The first criterion is how frequently people run into the debt. For example, if tech debt is hidden in a large function that people need to work with often, it is obviously a problem. But if the function runs by itself and no one needs to touch it for months, then you should direct your focus elsewhere. The second criterion is the amount of effort the team needs to spend fixing it: is it a quick fix or a large investment? The smaller the effort, the sooner it should be fixed. Sometimes a large effort can be broken down into smaller pieces. The third criterion is the impact on both customers and engineering efficiency once the debt is fixed. In terms of customer impact, you should ask yourself whether fixing the debt would improve reliability or usability for the customer, and by how much. If it doesn’t help the customer, does it have an impact on engineering velocity?

Next, I had to enable the whole team to surface various pieces of technical debt. It should be more of a bottom-up than a top-down effort, in which the team raises tech debt on a Jira board or in GitHub. Once an item was surfaced as a ticket, I would work with the team to assess its frequency, effort, and impact. The best way to approach it is to assign every item a score; those with a high score become priorities. I had to communicate all of that to the team and make sure they understood and were aligned on these high-priority items.
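The scoring step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-to-5 scales, the `DebtItem` fields, the formula in `priority_score`, and the sample backlog are all hypothetical, not the scoring we actually used or the one from the blog post.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """A tech debt ticket rated on three hypothetical 1-5 scales."""
    title: str
    frequency: int  # 1 = rarely touched, 5 = engineers hit it constantly
    effort: int     # 1 = quick fix, 5 = large investment
    impact: int     # 1 = little benefit once fixed, 5 = large benefit


def priority_score(item: DebtItem) -> float:
    # Assumed formula: frequency and impact raise the score, effort lowers it.
    return item.frequency * item.impact / item.effort


def prioritize(items: list[DebtItem]) -> list[DebtItem]:
    # Highest score first, so the top of the list becomes the priority.
    return sorted(items, key=priority_score, reverse=True)


# Made-up backlog for illustration only.
backlog = [
    DebtItem("Rewrite nightly batch job", frequency=1, effort=5, impact=2),
    DebtItem("Split oversized billing function", frequency=5, effort=4, impact=5),
    DebtItem("Rename confusing config flags", frequency=2, effort=1, impact=1),
]

for item in prioritize(backlog):
    print(f"{priority_score(item):5.2f}  {item.title}")
```

Any formula that rewards frequency and impact and penalizes effort would serve the same purpose; what matters is that the team agrees on the scales and applies them consistently.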

How much time a team should be given to fix tech debt is always a subject of debate. We allocated 25 percent of the time in each sprint to make sure we had enough time to tackle it. One of the things I learned was not to be too attached to a specific percentage. Sometimes urgency would come from the product side, sometimes from the technical side; therefore, I tried to be flexible on a week-by-week basis. I was committed to keeping it at 25 percent as an annual average rather than sticking to the number every week.
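Holding a target as a long-run average rather than a per-sprint rule comes down to simple bookkeeping. The sketch below assumes each sprint is logged as hypothetical `(debt_hours, total_hours)` pairs; the function names and sample numbers are illustrative only.

```python
TARGET = 0.25  # share of engineering time reserved for tech debt


def running_allocation(sprints: list[tuple[float, float]]) -> float:
    """Share of all logged hours that went to tech debt so far."""
    debt = sum(d for d, _ in sprints)
    total = sum(t for _, t in sprints)
    return debt / total if total else 0.0


def debt_hours_to_catch_up(sprints: list[tuple[float, float]],
                           next_total_hours: float,
                           target: float = TARGET) -> float:
    """Debt hours needed next sprint to bring the running average back to target."""
    debt = sum(d for d, _ in sprints)
    total = sum(t for _, t in sprints)
    needed = target * (total + next_total_hours) - debt
    # Cannot spend negative hours, or more hours than the sprint has.
    return min(max(needed, 0.0), next_total_hours)


# Made-up history: one sprint was squeezed by an urgent product deadline.
history = [(40, 200), (20, 200), (70, 200)]
print(f"running share: {running_allocation(history):.1%}")
print(f"debt hours to plan next sprint: {debt_hours_to_catch_up(history, 200):.0f}")
```

A sprint that falls short simply raises the suggested hours for the next one, which keeps the weekly plan flexible while the long-run average stays honest.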

Finally, I had to create a mechanism that would enable us to measure success. We started by tracking how much tech debt we fixed over time, using the number of tickets as the measure. That number didn’t tell us much. During some sprints we addressed five or ten tech debt items, during others only one or two. It fluctuated a lot without telling us why, or what the impact of fixing that tech debt was. Later we started to track our engineering velocity too. If we were reducing tech debt, we would expect to see engineering velocity improve, reflected in the number of story points delivered.
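One simple way to read the velocity signal is to compare average story points per sprint before and after a batch of debt work. The sketch below is an assumption about how that comparison could be done, with made-up numbers; it is not the exact report we used.

```python
from statistics import mean


def velocity_change(points_before: list[int], points_after: list[int]) -> float:
    """Relative change in average story points per sprint."""
    before = mean(points_before)
    after = mean(points_after)
    return (after - before) / before


# Hypothetical story points for three sprints on each side of the debt work.
before_fix = [30, 28, 32]
after_fix = [34, 36, 35]

print(f"velocity change: {velocity_change(before_fix, after_fix):+.1%}")
```

Averaging over several sprints smooths out the sprint-to-sprint noise that made raw ticket counts hard to interpret, although a change in velocity can still have causes other than the debt work.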

Lessons learned

  • Addressing tech debt is a collaborative effort that should combine top-down and bottom-up approaches. A top-down approach, by means of a well-defined strategy, ensures the alignment of stakeholders. But day-to-day execution should be bottom-up. Make sure you secure buy-in from the team. The team should have a say in how tech debt is addressed and prioritized.
  • When we started surfacing tech debt issues, we created a tech debt backlog. There were hundreds of items, which scared people a great deal. The sheer number of items had a devastating psychological effect on engineers. We learned later that half of the issues were unimportant or were ones we would never be able to address, so we just got rid of them. I was driven by the belief that people should focus on what matters most. I was concerned that by discarding issues we might throw away something important without ever knowing it. But my logic was that if something is important, it will come back again.

© 2024 Plato. All rights reserved
