Managing feature toggles in teams
Part one
Published: August 26, 2020
For its advocates, trunk-based development (TBD) is seen as preferable to feature branches because it makes Continuous Integration easier and reduces the chance of painful merge conflicts. Despite its advantages, TBD introduces its own challenges. When all code is in the main branch, unfinished or untested features can intermingle with finished ones, preventing the deployment of otherwise completed work. Feature toggles, specifically release toggles, promise to address this, allowing us to hide unfinished features from users by toggling them off, removing barriers to deployment and enabling Continuous Delivery. While they’re integral to many teams' CD approaches, feature toggles come with associated costs: added complexity in the code and the overhead in the workflow for adding and removing them.
Over the last few years, some teams we’ve worked with ran projects that bore the costs of toggles without getting the anticipated benefits. These teams ended up reluctant to use toggles and kept the ones they had longer than necessary. In other projects, however, we saw toggles used in a completely different way that delivered all of the benefits, minimized the overhead, and even helped structure the team’s priorities.
We reflected on these different approaches to feature toggles. In this article, we’ll examine six key attributes that set effective uses of feature toggles apart from ineffective ones.
Feature toggles should be flippable
There are many light- and heavyweight approaches to implementing feature toggles. Some store values in a database table and offer a UI to flip them; others just put them into environment variables. Whatever approach you choose, it should always be a matter of seconds to flip your toggles. Toggles hard-coded in an application’s configuration aren’t very valuable, as it may take a whole tour through the pipeline to flip them.
If toggles can be flipped in different environments, this can enable product owners and stakeholders to easily demo and explore new features before they’re made visible to users in production.
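As an illustration of the lightweight end of that spectrum, here is a minimal sketch of a toggle read from an environment variable. The `FEATURE_` prefix, the `is_enabled` helper and the set of accepted values are our own assumptions for this example, not part of any particular toggle library.

```python
import os

# Values we treat as "on"; this set is an assumption for the sketch.
TRUTHY = {"1", "true", "on", "yes"}


def is_enabled(name: str, default: bool = False) -> bool:
    """Look up a toggle in the environment each time it is asked for.

    The variable name is built as FEATURE_<NAME>, a hypothetical convention.
    A missing variable falls back to `default`.
    """
    raw = os.environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


if __name__ == "__main__":
    os.environ["FEATURE_SORT_BY_DATE"] = "on"
    print(is_enabled("sort_by_date"))   # toggle set for this environment
    print(is_enabled("sort_by_price"))  # not set, so the default applies
```

In most deployments a changed environment variable only takes effect after the process restarts, which is usually still far quicker than pushing a hard-coded value through the whole pipeline.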
Feature toggles should be used by default
One of the key problems with feature toggles is that they are simply not used. This may be because adding a toggle isn’t discussed when kicking off a story, or because toggles are only added when someone explicitly argues for that particular toggle. Toggles should, however, be the default for every story. The only cases where toggles shouldn’t be created are very minor changes, like small bug or UI fixes. You should aim to reach a convention within your team on when toggles are necessary, taking into account your team’s structure and needs.
The first step for every story could be to create the toggle, even if it’s not hiding anything at that point in time. The second step could then be to flip it on in every non-live environment and off in the production environment. This way, changes impacting or colliding with other features under development are noticed early on but cause no problems in the live environment.
Toggles in non-live environments should only be flipped off if they’re causing major issues that risk or slow down other development or if the feature behind the toggle is being demonstrated. The toggle in the live environment is only flipped on once the story has been signed off successfully.
Monitoring the state of toggles adds another layer for detecting potential problems. Toggles that aren’t enabled in every non-live environment, or that are enabled on live, should be flagged. The former should be checked for bigger problems; the latter can probably be removed.
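The check described above can be sketched as a small audit function. The environment name `production`, the shape of the `states` mapping and the report keys are assumptions made for this example; a real team would feed it from wherever its toggle values are stored.

```python
from typing import Dict, List

LIVE = "production"  # assumed name of the live environment


def audit_toggles(states: Dict[str, Dict[str, bool]]) -> Dict[str, List[str]]:
    """Flag toggles whose state deviates from the convention.

    `states` maps toggle name -> {environment name -> enabled?}.
    """
    report: Dict[str, List[str]] = {"off_in_non_live": [], "on_in_live": []}
    for toggle, envs in sorted(states.items()):
        non_live_states = [on for env, on in envs.items() if env != LIVE]
        if not all(non_live_states):
            # Switched off somewhere before live: check for bigger problems.
            report["off_in_non_live"].append(toggle)
        if envs.get(LIVE, False):
            # Already on in live: a candidate for removal.
            report["on_in_live"].append(toggle)
    return report


if __name__ == "__main__":
    states = {
        "sort_by_date": {"dev": True, "staging": True, "production": True},
        "sort_by_price": {"dev": True, "staging": False, "production": False},
        "new_checkout": {"dev": True, "staging": True, "production": False},
    }
    print(audit_toggles(states))
```

Toggles that follow the convention, like the hypothetical `new_checkout` here, appear in neither list.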
When feature toggles are used consistently for every story, they provide an additional, quick visualization of the feature-development status in the team.
Feature toggles should be added per story
Stick to the rule of one toggle per story and don’t reuse toggles. Failure to do so might lead to catastrophic failures. Even without considering huge disasters, per-story feature toggles give you and your team many advantages in development.
Example: Two stories are being worked on that enable ‘sort by date’ and ‘sort by price’ respectively. At first glance, it might feel intuitive to create one toggle for sorting and hide both features behind it. However, if one of the stories is done but the other one is more complex and still needs time, the toggle is now hiding the feature of the finished story. If this toggle were enabled on live to release the finished story, it would also leak the unfinished feature.
Figure 1: How two related stories should be toggled
Simple, per-story feature toggles enable one of the key benefits of using toggles: independent development and deployment. Furthermore, the simpler a feature toggle is, the easier it becomes to clean it up once it is no longer needed.
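The sorting example can be sketched with one toggle per story. The toggle names `sort_by_date` and `sort_by_price`, the pre-existing `relevance` option and the plain dictionary used as a toggle store are all assumptions for illustration.

```python
from typing import Dict, List


def available_sort_options(toggles: Dict[str, bool]) -> List[str]:
    """Return the sort options to offer, given per-story toggles."""
    options = ["relevance"]  # assumed behaviour that existed before both stories
    if toggles.get("sort_by_date", False):
        options.append("date")   # story 1, guarded by its own toggle
    if toggles.get("sort_by_price", False):
        options.append("price")  # story 2, guarded by its own toggle
    return options


if __name__ == "__main__":
    # Story 1 is signed off, story 2 is still in progress.
    live = {"sort_by_date": True, "sort_by_price": False}
    print(available_sort_options(live))
```

Because each story has its own toggle, the finished story can go live while the unfinished one stays hidden, and each guard can later be deleted without touching the other.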
Feature toggles should be visible
A problem can be that team members picking up a new story aren’t aware of existing feature toggles. By simply adding a note to the corresponding story card with the name of the toggle, toggles gain visibility and all team members know which toggles they need to flip in case a feature should be turned on or off for a given situation.
Figure 2: Toggles can improve teams' visibility over progress of features
Feature toggles can be considered deliberate tech debt, so making this debt visible is the first step towards managing it.
Visible feature toggles make the team aware of the number of features currently in progress.
Feature toggles should be short-lived
As with toggle creation, the decision of when to remove toggles should be a team-wide convention. In past projects, we’ve seen the removal of toggles being tracked with ‘remove toggle X’ story cards that were put in ‘ready for development’ once a story with a toggle was signed off. Experience showed that these cards would rarely be prioritized and toggles remained in the codebase much longer than they should, slowing future development as developers worked around them.
Using a ‘work in progress’ limit is a common practice to improve the throughput of development teams. In addition, or even as a replacement, a ‘toggles in existence’ limit can be introduced, which forces the team to remove old toggles before a new one is created. This builds on the visibility of toggles and pushes the team to actively manage this tech debt.
Furthermore, once a story is signed off, it can be moved to a ‘toggled on’ column instead of ‘done’, which it only leaves when its toggle is deleted. This helps emphasize that a feature isn’t truly ‘done’ until we’ve cleaned up the tech debt we created along the way. We can also attach an ‘expiry date’ to the toggle, allowing us to observe the feature ‘in the wild’ before we remove the toggle. The timing should be quite short, but a period of observation can help build confidence in new features.
Keeping feature toggles short-lived reduces the technical debt that might pile up otherwise.
Figure 3: Visibility and expiration dates of toggles can help reduce the build-up of tech debt
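A ‘toggles in existence’ limit and expiry dates can also be enforced in code rather than only on the board. The following is a rough sketch under our own assumptions: the `ToggleRegistry` class, its method names and the `ToggleLimitReached` exception are hypothetical and not taken from any library.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


class ToggleLimitReached(Exception):
    """Raised when a new toggle would exceed the team's agreed limit."""


@dataclass
class ToggleRegistry:
    limit: int
    # Toggle name -> expiry date (None until the story is signed off).
    expiry_by_name: Dict[str, Optional[date]] = field(default_factory=dict)

    def create(self, name: str) -> None:
        if name in self.expiry_by_name:
            raise ValueError(f"toggle {name!r} already exists")
        if len(self.expiry_by_name) >= self.limit:
            raise ToggleLimitReached("remove an old toggle before adding a new one")
        self.expiry_by_name[name] = None

    def set_expiry(self, name: str, expires: date) -> None:
        """Attach an expiry date, for example when the story is signed off."""
        self.expiry_by_name[name] = expires

    def remove(self, name: str) -> None:
        del self.expiry_by_name[name]

    def expired(self, today: date) -> List[str]:
        """Names of toggles whose observation period is over."""
        return sorted(
            name
            for name, expires in self.expiry_by_name.items()
            if expires is not None and expires <= today
        )
```

A check like `expired(date.today())` could run in the build and fail it, or simply be printed at stand-up; which of these fits is a team decision.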
Feature toggles should be tested
One challenge when working with feature toggles is testing. While a toggle exists, we need to maintain tests for both the old and the new functionality, which can mean duplicated code and confusion for developers who need to clean up those tests when removing the toggle.
One strategy to avoid this confusion is to enable the feature toggle by default for all tests when adding a new feature. In practice, this meant that when an old test failed, we copied it. We kept the original, switched the toggle off there, and marked it as a legacy test. The copy we either adjusted to the new expected behavior or dropped if we decided during development that it had become unnecessary.
When cleaning up the toggle, we would remove all legacy tests, that is, those that set the toggle to inactive for their test scenario. This way, after cleaning up, only tests using the new feature or tests unaffected by the new feature would remain.
Writing and updating tests in this way enables an easy clean-up of feature toggles such that they can be removed quickly and without needing to have much context.
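The testing convention above might look like this with Python’s standard `unittest` and `unittest.mock`. The `TOGGLES` dictionary, the `sort_options` function and the test names are assumptions made for the sketch; the point is only that the toggle is on by default and the legacy test is the one place that switches it off.

```python
import unittest
from unittest import mock

# Hypothetical toggle store; in tests every toggle defaults to on.
TOGGLES = {"sort_by_price": True}


def sort_options():
    """Code under test: offers price sorting only when the toggle is on."""
    options = ["relevance"]
    if TOGGLES["sort_by_price"]:
        options.append("price")
    return options


class SortOptionsTest(unittest.TestCase):
    """The adjusted copy: describes the new expected behavior."""

    def test_price_sorting_is_offered(self):
        self.assertEqual(sort_options(), ["relevance", "price"])


class LegacySortOptionsTest(unittest.TestCase):
    """LEGACY: delete together with the sort_by_price toggle."""

    # patch.dict switches the toggle off for this test only and restores it after.
    @mock.patch.dict(TOGGLES, {"sort_by_price": False})
    def test_only_relevance_sorting_is_offered(self):
        self.assertEqual(sort_options(), ["relevance"])


if __name__ == "__main__":
    unittest.main()
```

When the toggle is removed, everything marked as legacy can be deleted without reading the test bodies, and the remaining tests already describe the new behavior.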
Outlook
These principles for creating, managing and cleaning up feature toggles helped our past project teams become considerably more flexible and faster. The latest edition of the Thoughtworks TechRadar highlights the Simplest possible feature toggle and these principles are how we implemented this technique.
There are many ways to tweak toggles to your team’s needs and improve even more. Take the time to discuss conventions, how you are using toggles and what you are expecting when you build a feature.
This article also exists as a video presentation here.
Read the rest of this series:
- Part two: The limits of feature toggles
- Part three: Feature toggles and database migrations
- Part four: Testing feature toggles
- Part five: Static vs dynamic feature toggles
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.