In the first installment of this three-part series, we delve into the essence of the Strangler Fig pattern, exploring its mechanics and the rationale behind its adoption for legacy system modernization. Through a real-world case study on the modernization of a coupon system for a retailer, we illustrate the practical application and benefits of this pattern, shedding light on its pros and cons in a tangible scenario. In part two, we will shift our focus to the human and procedural aspects of employing this pattern, while part three will tackle the technical challenges involved.
Introduction
The journey of modernizing legacy systems can often feel like navigating through a dense forest of old, entangled roots and branches, with a quest to reach the sunlit canopy of modern, efficient and scalable technology. Conventionally, some organizations may consider a “big bang” approach — completely replacing the old system with a new one in one fell swoop. However, the big bang approach can be akin to chopping down the entire forest and starting anew — it’s disruptive, fraught with risk, and can be an immense strain on people and resources. It's no surprise then that it's relatively uncommon for organizations to adopt this method for significant system overhauls.
In search of a more manageable alternative, many organizations are turning towards a pattern inspired by nature: the Strangler Fig pattern. The name, coined by Martin Fowler, comes from the strangler fig tree, and the approach it describes is a gentler, iterative process. Just as a strangler fig gradually envelops its host tree and eventually takes its place, this approach involves building a new system around the edges of the old, gradually replacing it piece by piece.
This article explores the Strangler Fig pattern — a viable alternative to the big bang approach. We delve into its principles, how it mitigates risk and how leveraging it can effectively modernize legacy systems, ensuring that organizations can reach the sunlit canopy of technological advancement without clearing the forest.
The mechanics of the Strangler Fig pattern
Conceptually, here is how the Strangler Fig pattern works, step by step:
Identify system boundaries: Start by identifying the boundaries of the existing system you want to replace. This could be an entire application or a smaller subsystem within a larger application.
Define thin slices: Break down the system into manageable parts or "thin slices," which are small enough to be replaced incrementally but significant enough to deliver business value. These slices should be independent and self-contained where possible.
Introduce an indirection layer: This layer will act as a software seam that enables the introduction of new components in a manner that is transparent to the consumers. This could be a simple switch or a more complex routing mechanism, depending on the nature of the system.
Develop new components: For each slice, develop a new component that replicates the functionality of the old one but with modern technologies and practices. You may also need to augment this component with additional functionality.
Route traffic: Once a new component is ready, implement a mechanism to route traffic to it through the indirection layer instead of the old component.
Retire old components: As new components take over, the old components they replace become redundant and can be retired. After replacing all the components of the old system, the entire system is ready for decommissioning.
Iterate: The Strangler Fig approach is iterative. You repeat the cycle of developing new components, rerouting traffic and retiring old components until the migration is complete.
By gradually replacing the old system, the Strangler Fig approach reduces the risks associated with system migration, minimizes disruption and allows for the continuous delivery of value throughout the process.
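The steps above can be sketched in a few lines of code. The following is a minimal illustration of an indirection layer, not the implementation used in the case study: the `StranglerFacade` class, its method names and the two stand-in handlers are all hypothetical. It shows how traffic for one thin slice can be switched to a new component (and switched back) while every other route still reaches the legacy system.

```python
from typing import Callable, Dict, Set

Handler = Callable[[dict], dict]


class StranglerFacade:
    """Hypothetical indirection layer: picks the legacy or new handler per route."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, route: str, new_handler: Handler) -> None:
        """Route traffic for one thin slice to its new component."""
        self._migrated[route] = new_handler

    def rollback(self, route: str) -> None:
        """Send a slice back to the legacy system if the new component misbehaves."""
        self._migrated.pop(route, None)

    def handle(self, route: str, request: dict) -> dict:
        """Consumers call this; they do not know which system answers."""
        handler = self._migrated.get(route, self._legacy)
        return handler({**request, "route": route})

    def legacy_retirable(self, all_routes: Set[str]) -> bool:
        """True once every known route is served by a new component."""
        return all_routes <= set(self._migrated)


# Stand-ins for the real systems (assumptions for illustration only).
def legacy_system(request: dict) -> dict:
    return {"served_by": "legacy", "route": request["route"]}


def new_coupons_component(request: dict) -> dict:
    return {"served_by": "new", "route": request["route"]}


facade = StranglerFacade(legacy_system)
facade.migrate("/get_coupons", new_coupons_component)
print(facade.handle("/get_coupons", {}))
print(facade.handle("/clip_coupon", {}))
```

Keeping the routing decision in one place is what makes each slice independently reversible: a rollback is a change to the routing table rather than a redeployment of either system.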
The pros and cons of the Strangler Fig pattern
The Strangler Fig pattern offers several advantages when it comes to replacing legacy systems, especially those that are large, complex and critical to ongoing operations:
Incremental change: The Strangler Fig pattern allows for the gradual replacement of the old system, reducing risk and making the process more manageable.
Continuous operation: Because the system is replaced piece by piece, the legacy system can continue to operate during the transition — this enables business continuity and minimizes disruption.
Reduced risk: Breaking the replacement process into smaller, incremental changes reduces the risk associated with each change. Issues are easier to diagnose and rectify because they are confined to a smaller part of the system.
Flexibility: The incremental nature of the Strangler Fig pattern also allows for flexibility. If business needs change or if a certain approach isn't working, it's easier to change direction without losing a large amount of work.
Learning and improvement: Working incrementally allows the team to learn from each step, refine their approach and improve over time, eventually leading to better results. Early increments can also serve as evidence of the effort's value without incurring all the spend upfront.
Phased allocation: Because the new system is built incrementally, people and resources can be allocated in phases, which can be more manageable than committing them to a complete overhaul.
While the Strangler Fig approach has many benefits, it can also present certain challenges or become problematic in some scenarios:
Complex interdependencies: Legacy systems often have complex, deeply embedded interdependencies. Disentangling these to replace one piece at a time can be very challenging.
Long transition period: The Strangler Fig approach is incremental and can extend the transition period, which could lead to prolonged costs and potential complexities in managing two systems simultaneously.
People allocation: Balancing people and resources between supporting the legacy system and developing the new system can be complex. It may also be challenging to find or train personnel to work with old and new technologies.
Resistance to change: Like any significant change, there can be resistance from users or stakeholders, which can slow down or complicate the process.
Data synchronization: During the transition period, the same data may need to be kept up to date in both the legacy and new system, which can be complex and error-prone.
Multiple system overhead: There's a risk that, due to changing business priorities, changing stakeholders or other issues, the process stalls before the legacy system is completely replaced, resulting in an incomplete migration. This can leave behind a hybrid system, which may be more complex and more challenging to maintain than either a full legacy or a fully modernized system.
Case study: Modernizing a coupon system for a retailer
Note: We’ve included a relatively simple case study here, keeping in mind the length of the article. If you want to learn about a more complex case study, please check out my talk titled Distributed Event-Driven Services from the Trenches at the O'Reilly Software Architecture Conference. You can find even more details in Part 1 and Part 2 of my talks at the AxonIQ Conference.
For grocery retailers, coupon functionality plays a pivotal role in several aspects of the business. Coupons serve as a key marketing tool, helping to attract new customers, retain existing ones and drive sales of specific products. They can stimulate increased purchasing volume and encourage consumers to try new or less popular items. Additionally, coupons can also aid in inventory management, helping to clear out stock before it expires or becomes obsolete. Therefore, having a robust and efficient system for managing coupons is critical.
The coupon management system at the large grocery retailer we worked with was plagued with various issues. These included inaccurate validations that led to unauthorized discounts, a lack of scalability that resulted in system overloads during peak periods and inflexible rules that obstructed the creation of custom offers. Furthermore, poor integration with other systems resulted in inconsistent data, and the absence of analytics meant the retailer missed potential insights into effective marketing strategies. Despite these issues, the retailer still relied heavily on the legacy coupon management system. Its critical role in driving sales and managing customer relationships meant that it remained a significant part of the retailer's operations, underscoring the urgent need for system improvement and modernization.
At a high level, the legacy implementation provided four RPC-style endpoints that suffered from several issues:
Endpoint in legacy system | Functionality | Issues |
---|---|---|
/get_coupons | Fetches all coupons. Used by default and for unauthenticated users. | Non-deterministic: Returned unsorted results in a seemingly random order. |
/clip_coupon | Allows a user to clip a coupon. | Poor analytics: Inability to target customers with more relevant coupons. |
/unclip_coupon | Allows a user to unclip a coupon. | As above. |
/get_my_coupons | Get coupons clipped by a specific user. | As above. |
Issues with legacy coupons management system
To make matters worse, the legacy system had little to no documentation or automated tests, making it quite challenging to make even the smallest of changes. To minimize risk, we decided to take an iterative approach. We outline some of the key decisions here:
Building a solid foundation of understanding: To ensure a successful system transition, the team prioritized building a thorough understanding of the existing system's functionality. This involved multiple discussions with Subject Matter Experts (SMEs) and creating automated functional tests that treated the system as a black box. These tests provided valuable insights into the system's behavior. Additionally, we developed automated performance tests to understand non-functional requirements in detail. Log statements were strategically introduced into critical areas using aspect-oriented programming, creating a centralized logging module. This low-risk approach allowed for seamless integration without altering the legacy codebase. With these insights into the system's inner workings, the team could make informed decisions while implementing the new solution.
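To make the black-box testing idea concrete, here is a minimal sketch of a characterization ("golden master") check. It is an illustration under stated assumptions rather than the tests the team actually wrote: the coupon fields, the recorded `GOLDEN_MASTER` data and the `fake_legacy_get_coupons` stand-in are all hypothetical. Because /get_coupons returned results in a seemingly random order, the comparison normalizes ordering before comparing.

```python
import json
import random
from typing import Callable, List

Coupon = dict

# Hypothetical responses recorded from the legacy system (assumed shape).
GOLDEN_MASTER: List[Coupon] = [
    {"id": "c1", "discount": "10%"},
    {"id": "c2", "discount": "5%"},
    {"id": "c3", "discount": "2-for-1"},
]


def fake_legacy_get_coupons() -> List[Coupon]:
    """Stand-in for a call to the legacy endpoint; the order varies per call."""
    coupons = list(GOLDEN_MASTER)
    random.shuffle(coupons)
    return coupons


def normalize(coupons: List[Coupon]) -> List[Coupon]:
    """Sort on a canonical JSON form so ordering differences are ignored."""
    return sorted(coupons, key=lambda c: json.dumps(c, sort_keys=True))


def matches_golden_master(fetch: Callable[[], List[Coupon]],
                          golden: List[Coupon]) -> bool:
    """Compare what an implementation returns with the recorded behavior."""
    return normalize(fetch()) == normalize(golden)


print(matches_golden_master(fake_legacy_get_coupons, GOLDEN_MASTER))
```

The same check can later be pointed at the new component, so one recorded data set serves as a regression test for both implementations.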
Picking the first slice: It was time to pick that all-important first thin slice. We chose the /get_coupons functionality because it was heavily used and at the same time, was not as complex as the other pieces. This approach allowed us to strike a balance between addressing pressing needs and ensuring a smooth modernization process, setting the stage for future enhancements while delivering incremental value to stakeholders.
Building the first slice: We then built our first component using a set of sensible defaults for planning, architecture, development, testing, deployment and more. We also built a simple API gateway component that acted as a passthrough to the legacy system for all requests except the /get_coupons API. The API gateway had minor transformation logic to retain backwards compatibility with existing consumers. For newer consumers, the new GET /coupons API was RESTful, paginated, allowed consumers to choose the outputs they needed, performed considerably better, was well documented, was test-driven and came with several other improvements.
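A gateway of this kind might look roughly like the sketch below. The response shapes are assumptions made for illustration: we assume the legacy /get_coupons returned a flat list and the new API returns pages with `items` and `total` fields. The function names and sample data are likewise hypothetical.

```python
from typing import Callable, List

# Hypothetical data held by the new component.
_COUPONS = [{"id": "c3"}, {"id": "c1"}, {"id": "c2"}]


def new_get_coupons(page: int = 1, page_size: int = 2) -> dict:
    """Stand-in for the new GET /coupons: sorted and paginated (assumed shape)."""
    coupons = sorted(_COUPONS, key=lambda c: c["id"])
    start = (page - 1) * page_size
    return {"items": coupons[start:start + page_size],
            "page": page,
            "total": len(coupons)}


def legacy_shaped_get_coupons() -> List[dict]:
    """Transformation logic: walk every page and return the flat list
    that existing consumers are assumed to expect."""
    result: List[dict] = []
    page = 1
    while True:
        body = new_get_coupons(page=page)
        result.extend(body["items"])
        if not body["items"] or len(result) >= body["total"]:
            return result
        page += 1


def gateway(path: str, legacy_call: Callable[[str], object]) -> object:
    """Pass everything through to legacy except the migrated slice."""
    if path == "/get_coupons":
        return legacy_shaped_get_coupons()
    return legacy_call(path)


def fake_legacy(path: str) -> object:
    """Stand-in for forwarding a request to the legacy system."""
    return ("legacy", path)


print(gateway("/get_coupons", fake_legacy))
print(gateway("/clip_coupon", fake_legacy))
```

Keeping the transformation in the gateway means existing consumers need no changes on day one, while newer consumers can call the paginated API directly.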
Ensuring consumer adoption: We prioritized consumer engagement and user satisfaction to promote timely adoption during thin-slice development. We constantly maintained transparent communication, informing the consumers of the migration's rationale, benefits and timeline. By addressing specific pain points and incorporating consumer feedback, we tailored the thin slices to their needs, driving higher adoption rates. We provided awareness sessions and responsive support to ensure consumers were confident using the new features. Additionally, we shared success stories from other consumers who embraced the changes, inspiring confidence in the transition. Collaborative decision-making allowed consumers to feel valued and empowered, fostering ownership and commitment to the new system (Note: in this context, we use “consumers” to mean the consumers of the API. However, it would be even more valuable to include feedback from the end users of the system).
Managing stakeholders: We placed a high priority on clear communications and effective stakeholder management to ensure success. We established regular communication channels, tailored information for different stakeholders and addressed concerns proactively. We also emphasized stakeholder engagement through regular demos of incremental functionality, involving them in decision-making and managing expectations. The collaborative and transparent approach enabled us to navigate challenges and deliver successfully.
At the end of the migration, the APIs looked as follows:
API in old system | API in new system |
---|---|
GET /get_coupons | GET /coupons |
POST /clip_coupon | POST /coupons/{couponId}/clip |
POST /unclip_coupon | POST /coupons/{couponId}/unclip |
GET /get_my_coupons | GET /users/{userId}/coupons |
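The mapping in this table can also be expressed as data, which is one way a gateway could translate legacy calls while old consumers are still being migrated. The sketch below is illustrative only: the `translate` function is hypothetical, and it assumes that legacy requests carried `couponId` and `userId` as plain parameters, which the source does not specify.

```python
from typing import Dict, Tuple

Route = Tuple[str, str]  # (HTTP method, path or path template)

# Old-to-new mapping, taken from the table above.
ROUTE_MAP: Dict[Route, Route] = {
    ("GET", "/get_coupons"): ("GET", "/coupons"),
    ("POST", "/clip_coupon"): ("POST", "/coupons/{couponId}/clip"),
    ("POST", "/unclip_coupon"): ("POST", "/coupons/{couponId}/unclip"),
    ("GET", "/get_my_coupons"): ("GET", "/users/{userId}/coupons"),
}


def translate(method: str, legacy_path: str, params: Dict[str, str]) -> Route:
    """Map a legacy RPC-style call to its RESTful equivalent.

    Raises KeyError for an unknown route or a missing path parameter.
    """
    new_method, template = ROUTE_MAP[(method, legacy_path)]
    return new_method, template.format(**params)


print(translate("POST", "/clip_coupon", {"couponId": "c42"}))
print(translate("GET", "/get_my_coupons", {"userId": "u7"}))
```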