Implementing a major platform technology change

Chris Radcliffe

Software Leader and Mentor at Stealth Startup

Problem

Our platform's full-text search was showing its age and its limits. Our fundamental business model of providing subscription access to e-books via a handful of large collections was being challenged: customers now wanted to buy individual titles and to see those purchases reflected instantly. However, the existing mechanics of our platform meant that indexing individual titles took hours or even days, because the entire index of all documents had to be regenerated. We needed to improve our search subsystems so that new documents could be added and searched almost instantaneously. And of course, we needed to do this without breaking the system, taking it down, or disenfranchising any customers or end-users. This was a bet-the-company change.

Actions taken

Our objectives were fairly clear, but detailed plans for actually achieving them were not. It wasn't as easy as building a completely separate stack, because the content and historical user data had to be migrated in almost real time, and many of the technologies we have today weren't available then. Together with our Chief Architect, I started out by sponsoring a sequence of internal experiments on every aspect of what was going to change for customers and users, and formed a dedicated Core team to pursue them. This was the "managing down" part of the process. However, I also did some "managing up". It became very clear to me that we needed to go beyond our normal mode of "feature release" communication and broadcast to the whole company the big changes that would be occurring, and that we needed to give upper management choices about tradeoffs in budget and resources. We also needed to communicate the technical risks we were facing and tell the story of how we would manage them.
Once we had stated exactly what we were going to do technologically, we also had to decide how we were going to implement the changes. We couldn't just build another stack and permanently switch both customers and users to it, since so much historical data needed to be migrated and transformed to support the features of the new platform. Instead, we engaged in what I refer to as "wing-walking". In the 1920s, there were lots of experiments and stunts with airplanes, as aviation was still a new technology. One popular stunt was to walk across the airplane's wings while it was flying, sometimes to demonstrate the stability of the aircraft, sometimes simply as a daredevil act. The first rule of wing-walking is, "Don't let go of what you have a firm grip on until you have a firm grip on the next thing".
For our project, we did things in an extremely methodical "wing-walking" way, moving customers, technology and components step by step, and we always gave them a way to go back. We had determined that we could run two complete systems in parallel. That meant building, maintaining and feeding all content into both platforms at once. However, while this allowed incremental migration of a customer's holdings to the new platform, it effectively threw a one-way switch in code with respect to recording new purchases and user interaction with their new bookshelves. Because of this, I fostered discussion and garnered acceptance of a set of mechanisms to migrate customers and users back to the old system, even with the new data, in the event of a truly catastrophic failure. I personally did the research and experiments (SQL database work) on syncing user purchases and on bi-directional user bookshelf migration.
Finally, before releasing the new system, we practiced, practiced, practiced our procedures, failback and monitoring, and tested, tested, tested performance, accuracy, content ingest volume, and the new document search capability (instantaneous availability).
I continually communicated all of this to other leaders in the company, presented the story visually at all-hands meetings, and invited everyone to help test. Then we took action. There was no major rollback, and while there were a few problems with admin and prep, we corrected these within each phase before moving to the next. After the initial experimental wave, we "migrated" customers first by sending only their purchases and searches to the new subsystem, to confirm the resilience of the search stack. Then we followed with business logic and user bookshelf migration. Within days we had a verified success. Most of the failbacks were never used, but we were glad to have them at hand.
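
The parallel-run approach described above can be illustrated with a small sketch. It is not the code we ran; every name in it (`Platform`, `ParallelRun`, `migrate`, `fail_back`) is hypothetical, and in-memory sets stand in for the real search index and SQL tables. It assumes three things taken from the story: all content is fed into both platforms, each customer is routed to exactly one platform at a time, and purchases are synced across before the route changes in either direction.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    """Hypothetical stand-in for one complete stack (old or new)."""
    name: str
    index: set = field(default_factory=set)        # titles searchable here
    purchases: dict = field(default_factory=dict)  # customer -> set of titles

    def ingest(self, title):
        self.index.add(title)

    def record_purchase(self, customer, title):
        self.purchases.setdefault(customer, set()).add(title)


class ParallelRun:
    """Routes each customer to one platform while both stay fully fed."""

    def __init__(self):
        self.old = Platform("old")
        self.new = Platform("new")
        self.migrated = set()  # customers currently served by the new platform

    def ingest(self, title):
        # All content goes into both platforms, so either can serve anyone.
        self.old.ingest(title)
        self.new.ingest(title)

    def platform_for(self, customer):
        return self.new if customer in self.migrated else self.old

    def purchase(self, customer, title):
        self.platform_for(customer).record_purchase(customer, title)

    def search(self, customer, title):
        return title in self.platform_for(customer).index

    @staticmethod
    def _sync(src, dst, customer):
        # Copy the customer's purchases from src into dst (additive only).
        for title in src.purchases.get(customer, set()):
            dst.record_purchase(customer, title)

    def migrate(self, customer):
        # Get a firm grip on the new platform before letting go of the old.
        self._sync(self.old, self.new, customer)
        self.migrated.add(customer)

    def fail_back(self, customer):
        # Carry data recorded on the new platform back before re-routing.
        self._sync(self.new, self.old, customer)
        self.migrated.discard(customer)
```

In this sketch, a customer who buys a title after `migrate` and is then sent through `fail_back` still sees that title on the old platform, which is the property the failback mechanisms were meant to guarantee. A real implementation would also have to handle deletions, conflicts and in-flight writes, which this sketch ignores.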

Lessons learned

We were scared to death that we were going to risk the business, but inaction would have carried the same risk. By insisting on a very deliberate approach, rather than rushing it through and "hopefully taking a couple of weeks" as the company initially believed, we were able to introduce our new technology successfully. When you, as an engineering manager, are asked to make a thing happen "for the company", you have to do a lot of legwork yourself to ensure that stakeholders really understand the consequences of what they're asking for. Often, they won't really want to know the details, but presenting a plan in terms of risk will immediately get their attention. However, it is not sufficient to discuss risk without a ready plan to address it. What you are seeking is approval to pursue a plan your stakeholders understand. You can then complete the process by adjusting budget, manpower and timing against any previously set expectations. Engage with your stakeholders, early and often, and don't forget the first rule of wing-walking!


© 2024 Plato. All rights reserved

LoginSign up