Managing Customer Releases with Feature Flags instead of Branches
CTO and Founder at Blend
Blend provides a white-labeled consumer lending platform that streamlines the otherwise manual, paper-based, and generally painful borrowing process. One challenge inherent in our business model and industry is that different customers need different functionality and can accept change at different rates. Some customers want the latest functionality as soon as it's available, while others prefer to test every user-facing change in our beta environment for a month or more before allowing it to be promoted to production.
We considered two approaches to this problem.
One approach was to deploy an instance of our core service for each customer, maintaining separate branches as needed to control which functionality was present. This would keep the code cleaner (on a given branch) and not require any new tooling or frameworks. On the other hand, it would make debugging more difficult since different customers would be on different versions with as much as a month of skew among them. It would also make it necessary to manage a linearly growing set of instances of the service, with a nontrivial setup time for each additional instance. It would require us to maintain a large number of branches in production, making continuous delivery basically impossible.
The alternative was to deploy a single version of code for all customers, but control functionality differences using feature flags. Deploying a single version of code in production would keep debugging and code deployment simple since the team would only have to know about and understand a single recent version of code. It would also make it easy and instantaneous to revert changes that cause problems. The downside is that it would make the code more complex and branchy (each feature flag introduces at least one conditional), and would require new tools to manage flag state and scheduling. Finally, this approach would make it more difficult to fully customize anything for a given customer. Full customization can be very useful in the short term, but it does not scale as well in the long run.
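To make the trade-off concrete, here is a minimal sketch of one codebase serving two customers differently. It is illustrative only and not Blend's implementation: `FlagStore`, `application_steps`, the flag name `co_borrower_step`, and the customer IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FlagStore:
    """Hypothetical in-memory flag state: flag name -> customer IDs with the flag on."""
    enabled: Dict[str, Set[str]] = field(default_factory=dict)

    def is_enabled(self, flag: str, customer_id: str) -> bool:
        # Unknown flags default to off, so unfinished code ships dark.
        return customer_id in self.enabled.get(flag, set())


def application_steps(flags: FlagStore, customer_id: str) -> List[str]:
    """Return the borrower-facing steps for one customer (illustrative only)."""
    steps = ["identity", "income", "review"]
    # Each flag adds at least one conditional like this one.
    if flags.is_enabled("co_borrower_step", customer_id):
        steps.insert(1, "co_borrower")
    return steps


flags = FlagStore({"co_borrower_step": {"bank_a"}})
print(application_steps(flags, "bank_a"))  # includes the co_borrower step
print(application_steps(flags, "bank_b"))  # default flow
```

Reverting a problematic change in this model means removing a customer from the flag's set rather than redeploying code.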
"Deploying a single version of code in production would keep debugging and code deployment simple since the team would only have to know about and understand a single recent version of code."
My co-founders and I experienced the "branch per customer" approach first-hand at our previous company. Because many of our customers were not willing to use cloud-based services at the time (~2008-2011), we typically hosted the application in customer data centers. This structure permitted us to deploy a different version to each customer instance, but it made upgrading customers to the latest version more difficult: every live branch had to have the latest changes merged in, and not every customer was upgraded at the same time. We shipped a new version about once a month. As an engineer, I remember the pain of having to figure out how the code had worked a month earlier, on whichever version a given customer happened to be running, in order to debug an issue.
Because of this experience, my co-founders and I have always been adamant about hosting Blend in the cloud, and we had to overcome the objections of many early prospects to proceed with this. Among a multitude of other benefits, cloud hosting has made it unnecessary to host anything in customer data centers, allowing us to consider the feature flagging approach. This approach seemed like a better solution overall, so we went with it.
Today we have almost 200 feature flags in production. We've scaled our ability to manage them using our "Configuration Center" UI, which allows flags to be controlled for cohorts of customers and automatically scheduled for promotion.
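The internals of the Configuration Center are not described here, but the idea of cohort-level control with scheduled promotion can be sketched roughly as follows. This is a sketch under assumptions: the `Flag` class, the cohort names, the customer-to-cohort mapping, and the dates are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict


@dataclass
class Flag:
    name: str
    # Hypothetical schedule: cohort name -> date the flag turns on for that cohort.
    schedule: Dict[str, date] = field(default_factory=dict)

    def is_enabled(self, cohort: str, today: date) -> bool:
        start = self.schedule.get(cohort)
        # A cohort with no scheduled date never sees the flag.
        return start is not None and today >= start


# Hypothetical cohort assignment: each customer belongs to exactly one cohort.
CUSTOMER_COHORT = {"bank_a": "early_adopters", "bank_b": "one_month_beta"}

new_dashboard = Flag(
    "new_dashboard",
    {"early_adopters": date(2024, 3, 1), "one_month_beta": date(2024, 4, 1)},
)


def flag_enabled_for(flag: Flag, customer_id: str, today: date) -> bool:
    cohort = CUSTOMER_COHORT.get(customer_id)
    return cohort is not None and flag.is_enabled(cohort, today)
```

With this shape, a customer that wants a month of beta testing gets a later promotion date for its cohort, while the deployed code stays identical for everyone.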
The feature flagging approach has proven to be the right decision. It has scaled past 100 customers so far and allowed us to continue upgrading our core service relatively frequently (~daily) and in a highly automated fashion. Engineers only have to understand a small constant number of code versions at a given time.
While a few bugs have been caused by unanticipated, untested interactions between flags, this has not been a major issue by and large. Still, it is simpler to deal with a smaller number of flag configuration sets. In other words, try to have as many customers as possible on the same settings. To this end, we have a small, constant number of customer cohorts that share the same settings.
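The reason cohorts help is combinatorial: n independent on/off flags allow up to 2^n combinations, but only the combinations actually live in production need to be tested and understood. One rough way to track that number is sketched below; the function name and the sample data are hypothetical.

```python
from typing import Dict, Iterable


def count_configuration_sets(enabled_flags_by_customer: Dict[str, Iterable[str]]) -> int:
    """Count how many distinct flag configurations are live across customers."""
    return len({frozenset(flags) for flags in enabled_flags_by_customer.values()})


# Hypothetical data: four customers that fall into two cohorts.
by_customer = {
    "bank_a": {"new_dashboard", "e_sign"},
    "bank_b": {"new_dashboard", "e_sign"},
    "bank_c": {"e_sign"},
    "bank_d": {"e_sign"},
}
print(count_configuration_sets(by_customer))
```

If this count grows with the number of customers instead of staying flat, customers are drifting out of their cohorts.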
Three unanticipated classes of what we call "dead feature flags" have come about:
- On-everywhere flags: Flags that are enabled everywhere, but linger in the code
- Off-everywhere flags: Flags that have been in the code for months, but are still not enabled anywhere
- Custom flags: Flags that are only ever enabled at one customer, or that are enabled at all but one customer
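The three classes above could be detected with a periodic check along these lines. This is a sketch under assumptions, not a description of Blend's tooling: the function name and the 90-day staleness threshold are hypothetical, and a point-in-time snapshot cannot prove that a flag was "only ever" enabled for one customer.

```python
from typing import Optional, Set


def classify_dead_flag(
    enabled_for: Set[str],
    all_customers: Set[str],
    age_days: int,
    stale_after_days: int = 90,  # assumed stand-in for "months"
) -> Optional[str]:
    """Return the dead-flag class for one flag, or None if it looks healthy."""
    total = len(all_customers)
    if total == 0:
        return None
    on = len(enabled_for & all_customers)
    if on == total:
        return "on-everywhere"
    if on == 0:
        # Young flags that are off everywhere are simply not released yet.
        return "off-everywhere" if age_days >= stale_after_days else None
    if on == 1 or on == total - 1:
        return "custom"
    return None


customers = {"a", "b", "c", "d"}
print(classify_dead_flag({"a", "b", "c", "d"}, customers, age_days=10))
print(classify_dead_flag(set(), customers, age_days=120))
print(classify_dead_flag({"a", "b", "c"}, customers, age_days=10))
```

A report like this gives each pod a concrete cleanup list instead of relying on memory of which flags have finished rolling out.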
The number of on-everywhere flags tends to grow because pods do not necessarily prioritize their removal immediately.
Off-everywhere flags come about in several cases:
- A feature is started but deprioritized.
- A feature is worked on for a long time behind a single flag. This is not ideal because it means that the change is not being released in production iteratively.
- A feature is finished, but no customer wants to enable it. The flag and the code it controls are kept out of optimism that customers will want it at some point.
The number of custom flags grows because of the permanently unique needs of certain customers. In these cases, the flag needs to be converted to a permanent configuration setting, or we need to work with the customer to remove the need for customization.
Everyone benefits from the deletion of unnecessary code, so we encourage pods to clean up after themselves in common codebases. We've been able to do this effectively using the Technical Health Pod.