Generative AI (GenAI) can revolutionize software development. It has the ability to drive significant productivity gains for software developers and can accelerate both the innovation cycle and time to market. However, its potential impact may be severely hampered if viewed narrowly as little more than a tool for code generation. Such a view is, unfortunately, not uncommon; it rests on a misunderstanding of both GenAI and the practice of software development.
This means there’s a real opportunity for business and technology leaders willing to closely engage with the software development process and the work of their technology teams. By acknowledging where GenAI can support software developers and where it cannot, they can be the first to leverage GenAI effectively to gain a real competitive advantage, empowering software developers not just to work faster but also to work smarter.
Where can generative AI add value?
GenAI can undoubtedly add significant value throughout the software development lifecycle. However, it’s important to note that quantifying this value is exceptionally difficult. In fact, there’s a risk that the need to quantify GenAI’s value-add is contributing to a narrowing of GenAI’s potential impact. At Thoughtworks we believe it is counterproductive to focus on the benefit gained from code productivity alone. We advocate a view that GenAI is part of the tooling that, together with ways of working and the right team topologies, is essential in our experience to reducing time to market, increasing quality and maintaining team morale. This is distinct from the approach taken in recent studies and press coverage, which have focussed on measuring coding speed in the context of bounded and relatively simple problems, leaving the value of GenAI in other parts of the software development process largely ignored.
These are often parts of software development work that are complex, typically points of friction or waste. While they may not lend themselves to easy measurement, they are, we believe, the parts where the use of GenAI may actually be the most impactful. In recent months, a small group of us at Thoughtworks have been running a number of experiments to explore the potential of GenAI tools such as ChatGPT and GitHub Copilot across the software development lifecycle. The results we’ve seen are promising, suggesting a 10–30% increase in productivity. It’s worth noting, though, that these productivity gains are dependent on three factors:
Developer experience: Engineers need to know what to ask for and be able to judge GenAI output, so that using it doesn’t compromise quality.
Experience with GenAI: These tools require human inputs, which means users need the knowledge and skills not only to write effective prompts but also to learn the tools, work out what to use them for, and move on when they are just a distraction.
The nature of the problem at hand: GenAI is effective for problems that are well-defined; as a problem expands or becomes more complex, GenAI is much less likely to drive productivity gains.
Given those important caveats, let’s now take a look at some of the ways GenAI can add value to the work developers do.
Reducing repetitive work to create more time for high-value tasks
GenAI shines at pattern matching and pattern synthesis, such as translating one language into another. The most obvious use of that strength for software delivery is for a new kind of code generation, where the AI translates natural language into code, or one type of code into another. But this can also be taken advantage of in other areas, such as translating change logs into a release description, turning code and team chats into more coherent documentation or mapping unstructured information into more structured formats and templates. It could even help teams generate test and sample data.
In other words, it can remove some of the more time-consuming tasks so developers have more time for complex, value-adding work. It can do the pattern matching for us; we then augment the results and finish the ‘last mile’ ourselves.
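To make the change log example a little more concrete, here is a minimal sketch of how a team might assemble the input for such a translation. It only builds the prompt text; sending it to a model is deliberately left out, because that depends on whichever GenAI tool a team uses. The function name, the prompt wording and the version string are illustrative assumptions, not part of any specific product.

```python
def build_release_notes_prompt(version: str, changelog_entries: list[str]) -> str:
    """Build a prompt asking an LLM to turn raw change log lines into a release description.

    Hypothetical helper for illustration: it prepares text only and calls no model.
    """
    # Drop blank lines and stray whitespace so the model sees a clean list.
    cleaned = [entry.strip() for entry in changelog_entries if entry.strip()]
    if not cleaned:
        raise ValueError("no changelog entries to summarize")

    bullet_list = "\n".join(f"- {entry}" for entry in cleaned)
    return (
        f"You are drafting release notes for version {version}.\n"
        "Rewrite the change log entries below as a short, user-facing release description.\n"
        "Group related changes, and do not mention changes that are not listed.\n\n"
        f"Change log entries:\n{bullet_list}\n"
    )
```

Calling build_release_notes_prompt("2.4.0", ["Fix login timeout", "Add CSV export"]) returns a single string that could be pasted into a chat tool or passed to a model. Whatever comes back is still a draft: the ‘last mile’ of checking it against what actually shipped stays with the team.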
More complete thinking, earlier
Large language models (LLMs) have the capacity to surprise us. This is why they are often said to ‘hallucinate’: that is, to produce an output that is misleading or false, seemingly at odds with the data on which they have been trained. While this can clearly be risky in certain use cases, their ability to offer up something unexpected makes them great brainstorming partners and tools for ideation. They can point out gaps in our thinking.
We’ve seen great results for product and strategy ideation, such as prompting an LLM to generate scenarios that can trigger divergent thinking. We’ve also used LLMs as sparring partners to enhance user stories and testing scenarios. For instance, if we’re trying to ideate the different ways a given application may be used, LLMs can help expand our thinking, filling gaps with scenarios that we hadn’t thought about. The benefit of this is that, by capturing requirements more effectively, we reduce the need for rework later — greatly improving the speed of the development process.
Finding information just in time
One of the largest sources of inefficiency for software developers is finding the right information. From online searches to internal documentation, knowing where you need to go to find what you need can be a big overhead.
GenAI offers the opportunity to provide new types of search functionality on top of lots of unstructured sources of information. This is already happening: GitHub’s Copilot Chat (in beta at the time of writing) builds on Copilot’s existing coding assistance functionality to provide natural language and context-specific support to developers. Similarly, Atlassian Intelligence offers users a way to navigate and search dense and unstructured institutional information. Of course, it’s crucial that GenAI systems are appropriately integrated and trained on the necessary data, but when used effectively, they can give software delivery teams easy access to information in the context of their current task. This also opens up new ways for organizations to surface particularly critical information, such as having tools to remind users of compliance or security issues that need to be checked.
While GenAI chatbots shouldn’t be seen as a total replacement for in-depth and sourced research — and should always be monitored for accuracy and ‘hallucinations’ — if they are trained so they take the user’s context into account, they are very effective in minimizing friction and driving productivity.
The risks
The risks of GenAI are no secret. ‘Hallucinations’, bias and privacy have been widely discussed and debated in recent months. In the context of the software development lifecycle, those risks will manifest themselves differently depending on the capabilities, culture and objectives of engineering teams.
For instance, introducing GenAI tools to a team of inexperienced developers has the potential to undermine, rather than augment, their efficiency and the quality of the software they deliver. When faced with GenAI-generated code that doesn’t work, an inexperienced developer may commit to that solution unnecessarily and end up spending more time trying to make it work when they would have been better off reading the relevant documentation.
In short, given the possibility of error and hallucinations, it’s essential that outputs are always treated with caution. This is particularly true when repeatability is crucial: without the requisite level of attention, layering GenAI on top of immature practices can exacerbate and entrench existing problems rather than solve them.
Looking ahead: the importance of mature engineering practices
To take advantage of GenAI and leverage it to its full potential, organizations should embrace good practice when it comes to software engineering; this includes everything from continuous integration/continuous deployment (CI/CD) to DevOps. These practices are arguably more important in an age of GenAI than they were previously because they make it easier to both measure and manage process change. If you have delivery metrics baked into your workflows, for example, you can quickly determine the impact of GenAI, and any emergent challenges it might be posing can be quickly identified and remedied.
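As a rough illustration of what ‘delivery metrics baked into your workflows’ might look like in practice, the sketch below compares median lead time (commit to deployment) before and after a GenAI tool was introduced. The Change record, the function names and the idea of a single adoption date are all assumptions made for the example; real pipelines will have their own data sources and metric definitions.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One delivered change. Hypothetical record shape for illustration."""
    committed_at: datetime
    deployed_at: datetime


def lead_time_hours(change: Change) -> float:
    """Hours between a change being committed and being deployed."""
    return (change.deployed_at - change.committed_at).total_seconds() / 3600


def median_lead_time_by_period(
    changes: list[Change], adoption_date: datetime
) -> dict[str, Optional[float]]:
    """Median lead time for changes deployed before and after the adoption date."""
    before = [lead_time_hours(c) for c in changes if c.deployed_at < adoption_date]
    after = [lead_time_hours(c) for c in changes if c.deployed_at >= adoption_date]
    return {
        # None signals that there is no data for that period.
        "before": median(before) if before else None,
        "after": median(after) if after else None,
    }
```

A before/after comparison like this cannot show on its own that GenAI caused a change; team composition, scope and other process changes all move the numbers too. What it can do is give teams a shared baseline, so that a shift in either direction is noticed quickly and can be investigated.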
These practices can be further supported by an effective AI operating model. This is a strategic plan that articulates how artificial intelligence is to be used across the organization, providing guidance and governance where necessary. The benefit is that it can allow organizations to ensure they remain true to their strategic aims, culture and existing processes in spite of the rapid pace of change. Whether it guides tooling decisions or empowers teams to respond to changing regulatory requirements, a robust operating model makes it easier to remain abreast of these developments and adapt accordingly.
This isn’t to say organizations need a top-down approach to GenAI. It’s much more about building feedback loops that ensure awareness and alignment between what’s happening at the grassroots and strategic decision making. Such feedback loops are essential to organizational success in periods of rapid technological change; it can be tempting to follow what’s happening in the market, but these feedback loops help leadership teams ensure the decisions they make are always informed by and closely connected to what’s happening in their organization.
Nor is it to say that organizations should stop paying attention to the market. Rather, what really matters is the organization’s ability, and willingness, to properly empower development teams. As leaders, we need to support them to drive their own productivity through their curiosity and existing expertise.