Acting Quickly To Resolve an Incident
9 April, 2021

Engineering Manager at Carta
Problem
Recently we had an internal incident that required a prompt response. A long-standing customer notified us that they had found a bug on our platform. They were under a tight deadline to deliver some documents to the authorities through our platform. The bug prevented them from locating the documents, let alone sending them.
Actions taken
A couple of our engineers were tasked with investigating what happened and why. Although it was a matter of the greatest urgency, they were stuck for a few hours, unable to understand what had happened. I was called in to help them out.
I had a strong guess as to what the problem was. Staying calm and composed allowed me to coordinate our efforts in the right direction. I organized the team in no time: one person checked one piece of the application, another looked at the logs, and I talked to our point of contact, who was communicating with the client. At that moment, I understood that having accurate information was critical; the more accurate the information, the more precisely we could pinpoint the problem. I instructed the team to report to me frequently about what was going on. After collecting enough information, I was confident that my original guess was correct. We were able to identify what had happened to our customer’s documents and merely had to recover them.
By quickly resolving the incident, we managed to turn a disadvantageous situation in our favor and further strengthen the relationship with the client.
In addition, I made sure to turn this experience into institutional knowledge. In a situation of great urgency, when every second counts, one often can’t share what they are doing. But I seized the first opportunity after the incident was resolved to reach out to the engineers who were originally tasked with fixing the problem and offered to explain what I did step by step. Other people also joined and were appreciative of my efforts to make our actions a collective learning experience.
Lessons learned
- By acting quickly to resolve an incident, one can turn a disadvantageous situation into a great opportunity. Not only did I fix the problem promptly without jeopardizing the relationship with the customer, but our rapid response and commitment to being at our customer’s service strengthened that relationship.
- I was able to share what I did step by step, so the next time someone on the team encounters the same problem, we can resolve the incident even more quickly.