Many IT teams are focused on delivering new things: new products, services and internal tools, for example. However, as challenging as such work often is, the move from ‘build it’ to ‘run it’ after launch is rarely straightforward either.
Companies often adopt a reactive approach to software maintenance. This approach addresses problems only after they’ve occurred, often resulting in unexpected breakdowns and high repair costs.
A ‘break-and-fix’ mindset can be necessary in emergency situations, but it can also make things worse. While it can be tempting to view maintenance work as adding little value, failing to address problems properly only creates future issues as you accumulate tech debt. Fixing those issues later will require more resources (time, money and skills), and that cost will hurt your organization.
This is why proactive maintenance is so important for organizations today. Identifying and addressing potential issues before they happen minimizes downtime and boosts operational efficiency.
The risks of a break-and-fix mindset
Tech debt is one of those “invisible issues” hiding in IT systems. Opting for quick fixes to solve immediate issues, rather than undertaking comprehensive upgrades, might seem cost-effective and straightforward at first. However, over time, the accumulation of these patches contributes significantly to tech debt. This growing debt can escalate maintenance costs as the complexity of the system increases, reduce reliability due to potential bugs, introduce security vulnerabilities and create obstacles to innovation by diverting resources away from new developments. These issues can have serious repercussions for a company, damaging its reputation and negatively impacting its financial health.
This isn’t an isolated or marginal issue: it has been estimated that resolving existing tech debt in the U.S. alone would demand an investment of around $1.52 trillion. Moreover, the repercussions of tech debt, such as cybersecurity breaches, operational disruptions, failed projects and the maintenance of outdated systems, cost the U.S. economy approximately $2.41 trillion annually.
Supporting an ecosystem that hasn’t had preventative maintenance becomes a huge headache when products become unusable, customer experience declines and costs are sent spiraling.
When technical errors have an outsized impact
The recent AT&T network outage demonstrates the risks of reacting rather than planning ahead. An initial review of the incident — which affected millions of customers — found the disruption was caused by the "application and execution of an incorrect process used as we were expanding our network." What may have looked like a minor technical issue ultimately brought down the entire system.
The importance of proactive maintenance
This is why proactive maintenance is important: you take steps to tackle issues before they happen, actively working to reduce the risk of failures and downtime. There are many techniques and tools that can help here, but it’s particularly important to stress the value of real-time data analytics and feedback processes. Transparency and visibility into how a system is performing, and into the potential causes of failure, are a vital foundation for proactive maintenance.
One example of a proactive approach in action can be found in Service Operations Centers (SOCs), which are commonly used in the telecommunications industry. SOCs operate around the clock; they proactively monitor systems so teams can address issues before they affect users.
Indeed, appropriate service level agreements (SLAs) can ensure the impact of any technical issues is kept to a minimum. However, in a reactive setting, SLAs can be ineffective. Quantitative metrics for fragmented parts of a service prevent a more holistic approach that can address the complex and interdependent nature of modern software applications and systems.
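To make the point about fragmented metrics concrete, here is a minimal Python sketch of how per-component SLA targets combine along a single request path. It rests on simplifying assumptions that are not from the article: every component on the path must be up for the request to succeed, and components fail independently. The function names and the 99.9% figures are hypothetical, chosen only for illustration.

```python
from math import prod
from typing import Iterable


def end_to_end_availability(component_availabilities: Iterable[float]) -> float:
    """Availability of a path that needs every component to be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(component_availabilities)


def expected_downtime_hours(availability: float, period_hours: float = 30 * 24) -> float:
    """Expected downtime over a period (default: a 30-day month) at a given availability."""
    return (1.0 - availability) * period_hours


if __name__ == "__main__":
    # Hypothetical service made of five components, each meeting a 99.9% target.
    components = [0.999] * 5
    overall = end_to_end_availability(components)

    print(f"Per-component availability: {components[0]:.3%}")
    print(f"End-to-end availability:    {overall:.3%}")
    print(f"Downtime per component:     {expected_downtime_hours(components[0]):.2f} h/month")
    print(f"Downtime end to end:        {expected_downtime_hours(overall):.2f} h/month")
```

Under these assumptions, every component can meet its own target while the service as a whole falls short of it, which is one reason a metric per fragment can give a misleading picture of what users actually experience.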
Coming face-to-face with the consequences of reactive maintenance
The value of proactive maintenance became clear to some of us at Thoughtworks when we were working with a client that builds business accounting software. We found a critical issue with a cron job (a task scheduled to run automatically at set times) that was responsible for synchronizing order data when clients subscribed.
Although we had a monitoring system in place, it did not cover the possibility that the cron job itself might fail. Had the job failed, the consequences could have been incredibly damaging for the client: billing processes would have been delayed or possibly even inaccurate, seriously hurting customer trust.
So significant were these potential consequences that we took measures to enhance the alerting systems. Working with the client, we ensured that monitoring and scanning tools were properly configured and covered every disruption scenario we could identify.
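One common way to catch a scheduled job that fails silently is a heartbeat (or "dead man's switch") check: the job records a timestamp each time it succeeds, and a separate monitor raises an alert when that timestamp becomes too old. The Python sketch below illustrates the idea only. It is not the setup used on the engagement described here, and the function name, status labels and grace period are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def check_heartbeat(
    last_success: Optional[datetime],
    now: datetime,
    expected_interval: timedelta,
    grace: timedelta = timedelta(minutes=5),
) -> str:
    """Classify a scheduled job from the time of its last successful run.

    Returns "missing" if the job has never reported success, "late" if the
    last success is older than the expected interval plus a grace period,
    and "ok" otherwise.
    """
    if last_success is None:
        return "missing"
    if now - last_success > expected_interval + grace:
        return "late"
    return "ok"


if __name__ == "__main__":
    # Hypothetical hourly sync job, checked at a fixed point in time.
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    hourly = timedelta(hours=1)

    for label, last in [
        ("ran 30 minutes ago", now - timedelta(minutes=30)),
        ("ran 3 hours ago", now - timedelta(hours=3)),
        ("never ran", None),
    ]:
        status = check_heartbeat(last, now, hourly)
        # In a real system, anything other than "ok" would trigger an alert.
        print(f"{label}: {status}")
```

The design choice worth noting is that the monitor alerts on the absence of a success signal rather than on the presence of an error, so it still fires when the job never starts at all.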
From a situation that could have been catastrophic, we switched to an approach that ensured timely detection and resolution. This improved system reliability, reduced risks and optimized operational efficiency.
Proactive maintenance requires experienced support
Despite the potential consequences of inadequate and reactive maintenance, adopting a more proactive approach can be challenging for many businesses. Economic pressures and budgetary constraints are forcing leaders to reduce expenses and ‘do more with less’, which means areas not traditionally viewed as value-adding (like maintenance) are deprioritized.
This is where managed services can help. Used effectively, external expertise can help organizations ensure their projects, initiatives and day-to-day operations are always running effectively without having to build and invest in a brand-new capability. It shouldn’t come as a surprise that this is a growing area: Mordor Intelligence projects that the market for managed services will grow from $260 billion in 2023 to $380 billion by 2028.
Thoughtworks’ AI-Powered DAMO™ Managed Services move away from traditional approaches that focus on ticket resolution. Our AI-powered solution uses real-time insights to predict and inform iterative improvements that can be made to your digital assets over time to maintain code quality and extend their lifespan. We believe that while simply sustaining existing systems may be table stakes today, there’s a mindset shift happening. The growing burden of tech debt and the pressure to navigate the tension between cost savings and value creation will drive future-fit organizations to partner with a managed service provider that can help them carry out proactive, evolutionary maintenance.