Acting Quickly To Resolve an Incident
Engineering Manager at Carta
Recently we had an internal incident that required a prompt response. A long-standing customer notified us that they had found a bug on our platform. They were under a tight deadline to deliver some documents to the authorities through our platform. The bug prevented them from locating the documents, let alone sending them.
A couple of our engineers were tasked with investigating what had happened and why. Although it was a matter of the greatest urgency, they were stuck for a few hours, unable to understand what had gone wrong. I was called in to help them out.
I had a strong guess as to what the problem was. Being calm and composed allowed me to coordinate our efforts in the right direction. I organized the team in no time: one person checked a specific piece of the application and another looked at the logs, while I talked to our point of contact who was communicating with the client. At that moment, I understood that accurate information was critical; the more accurate the information, the more precisely we could pinpoint the problem. I instructed the team to report to me frequently about what was going on. After collecting enough information, I was confident that my original guess was correct. We were able to identify what had happened to our customer’s documents, and all that remained was to recover them.
By quickly resolving the incident, we managed to turn a disadvantageous situation to our advantage and further strengthen the relationship with the client.
In addition, I made sure to turn this experience into institutional knowledge. In a situation of great urgency, when every second counts, one often can’t stop to share what they are doing. But I seized the first opportunity after the incident was resolved to reach out to the engineers who were originally tasked with fixing the problem and offered to explain what I did step by step. Other people also joined and were appreciative of my efforts to make our actions a collective learning experience.
- By acting quickly to resolve an incident, one can turn a disadvantageous situation into a great opportunity. Not only did I fix the problem promptly without jeopardizing the relationship with the customer, but our rapid response and commitment to being at our customer’s service strengthened that relationship.
- I was able to share step by step what I did, so the next time someone on the team encounters the same problem, we can resolve the incident even more quickly.