The latest edition of the Thoughtworks Technology Radar features 48 AI-related blips — just under half the total number. When putting together this edition, we did have some reservations: was our attention being diverted by the current hype?
On reflection, though, AI’s presence in this Radar reflects the way that AI is infusing just about everything in software. From how we build to what we’re building, the reality is that so much of what’s new is shaped by the sheer scale of investment in AI.
However, there are some nuances. When we talk about AI today, it’s important to note we’re potentially talking about a whole range of things, from LLM apps to software assistants to synthetic data. Indeed, as this field becomes embedded in what we do, we may find that we talk less about ‘AI’ as a catch-all term and more about specific approaches or goals, not unlike how the industry thinks of cloud today.
To help give a clearer picture of the AI-related blips on this edition of the Radar, though, we’ve grouped them together across a number of different areas. We think it provides a good snapshot of what AI actually looks like in practice across the software industry — helpful if you want to get beneath the hype and headlines.
AI-assisted software development
AI-assisted software development was a significant area of discussion for this Technology Radar — and that’s borne out in the blips that feature.
We included a number of coding assistants, including Cursor, Cline and Windsurf [all three Tools/Trial]. Clearly the field is much richer than just GitHub Copilot, and we’d encourage software developers to explore what’s available. Each has a different feature set and design, and so encourages a particular style of working — some of which may be more appropriate and appealing than others. All of the coding assistants mentioned above have agentic capabilities, which is why we also included software engineering agents [Tools/Trial]. While we’re cautious about the rise of vibe coding, we’re excited to explore how agents can improve our workflows.
While there’s lots to be excited about when it comes to AI assistance in software development, we did use the Radar to flag some risks and anti-patterns. For instance, we think it’s essential to guard against complacency about AI-generated code [Techniques/Hold] and are concerned about the tendency to see AI as a replacement for pair programming [Techniques/Hold]. We also emphasised that AI-friendly code design [Techniques/Assess] is particularly important if you’re interacting with AI tools in your development workflow. And we stress that AI should be about team gains, not individual ones — we moved Unblocked [Platforms] (an AI team assistant) from Assess to Trial in this volume.
We’re also cautious about local coding assistants [Techniques/Hold]. As we write on the Radar, “we've found that they struggle with complex prompts, lack the necessary context window for larger problems and often cannot trigger tool integrations or function calls.” That said, there will likely be evolution in this area in the months to come, so pay attention if this is something you’re interested in exploring.
Finally, it’s worth highlighting that generative AI offers software development much more than just code generation. We’re particularly excited about how it can help us better understand legacy codebases [Techniques/Trial], speeding up modernization initiatives.
AI models
There were some very interesting developments in AI models in the weeks before we put together the Radar; namely, the emergence of DeepSeek R1 [Platforms/Assess]. The response was very noisy, as much from a financial and geopolitical perspective as a technical one. That was a bit of a shame — the work done by the DeepSeek team was highly innovative, and the performance gains they unlocked signal there are alternatives to the somewhat crude philosophy of ‘bigger is better’ when it comes to LLMs.
Also on this volume is Claude Sonnet [Tools/Trial], which we’ve been really impressed with. In particular, it’s showing good results when it comes to coding, something we’ve been exploring recently.
While LLMs may catch people’s attention, it’s also worth highlighting other kinds of models in this edition. We’re excited about, for instance, reasoning models [Platforms/Assess], which possess enhanced capabilities around step-by-step thinking and self-correction, and small language models [Techniques/Trial], which can be run on edge devices. It’s also worth mentioning ModernBERT [Tools/Assess], a successor to BERT. It’s an encoder-only transformer model designed for natural language processing tasks.
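To make the ModernBERT mention a little more concrete, here’s a minimal sketch of using it for masked-token prediction via Hugging Face transformers. The answerdotai/ModernBERT-base checkpoint ID and pipeline usage are assumptions based on publicly available documentation; check the model card and your library version before relying on this.

```python
# A minimal sketch of masked-token prediction with ModernBERT.
# Assumes a recent transformers release with ModernBERT support and the
# answerdotai/ModernBERT-base checkpoint on Hugging Face (an assumption;
# check the model card for the current ID and requirements).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

for prediction in fill_mask("Paris is the [MASK] of France."):
    print(prediction["token_str"], round(prediction["score"], 3))
```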
Fine-tuning and quantization
DeepSeek has got us very interested in further innovations in model optimization and fine-tuning. We’ve flagged a number of things in this area on this volume of the Radar, including model distillation [Techniques/Trial], a technique in which knowledge is transferred from a larger, more powerful model to a smaller, more efficient one, and torchtune [Languages and Frameworks/Assess], a Python library for authoring, post-training and experimenting with LLMs.
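To illustrate the idea behind model distillation, here’s a minimal sketch of the classic distillation loss in PyTorch: the student is trained to match the teacher’s softened output distribution. It’s illustrative only; real pipelines (whether hand-rolled or built with libraries such as torchtune) combine this with the ordinary task loss, data loading and scheduling.

```python
# A minimal sketch of a knowledge distillation loss in PyTorch (illustrative only).
# The student is pushed towards the teacher's softened output distribution.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # Soften both distributions with a temperature, then minimise their KL divergence.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # The temperature**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2
```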
Observability and interpretability; evals and guardrails
As the world seeks to bring AI models and systems into production, gaining observability into a system’s performance and assessing its accuracy and reliability are becoming more and more important. We once again featured NeMo Guardrails [Tools/Trial] on the Radar, this time moving it to Trial after it featured in Assess in April 2024. It’s been widely adopted across our teams on AI projects.
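For teams evaluating it, the basic shape of a NeMo Guardrails integration looks something like the sketch below, assuming a local config directory containing a config.yml and Colang rail definitions; the paths and configuration contents here are placeholders, not a recommended setup.

```python
# A minimal sketch of wrapping LLM calls with NeMo Guardrails.
# Assumes a ./config directory with a config.yml and Colang files defining the rails;
# the directory contents are placeholders, not a recommended configuration.
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# The rails sit between the user and the model, applying the configured input/output checks.
response = rails.generate(messages=[
    {"role": "user", "content": "Summarise our refund policy."}
])
print(response["content"])
```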
We’ve previously featured Langfuse on the Radar, but in this edition we wanted to highlight some alternatives: Arize Phoenix [Platforms/Assess], Weights & Biases Weave [Platforms/Assess] and Helicone [Platforms/Assess] are all worth exploring.
One tool that cuts across observability and evaluation we like is Humanloop [Platforms/Assess], which can help teams address a range of LLM challenges, from cost management to tracing and alerting. We think it could be particularly useful in highly regulated domains where managing risk is critical.
Interpretability is sometimes confused with observability, but it’s more about understanding how a model arrives at its outputs and validating that they’re reliable and accurate. We’re seeing growth in this area, but one tool worth looking at now is Gemma Scope [Tools/Assess].
LLM app development
We may not yet have arrived at a boom of frameworks for building LLM applications quite like the JavaScript framework boom of the mid-2010s, but we are seeing considerable growth. LangChain was arguably the framework that led the way but, as we noted 12 months ago, there were many reasons to be cautious.
In this edition, then, we were pleased to be able to highlight a number of alternative frameworks. We’ve already mentioned Weights & Biases, but we also featured PydanticAI [Languages and Frameworks/Assess], “a Python agent framework designed to make it less painful to build production grade applications with Generative AI” (according to the project itself). “Rather than trying to be a Swiss Army knife,” we note, “PydanticAI offers a lightweight yet powerful approach. It integrates with all major model APIs, includes built-in structured output handling and introduces a graph-based abstraction for managing complex agentic workflows.”
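As a flavour of that structured output handling, here’s a minimal sketch using PydanticAI to get a response validated against a Pydantic model. The model identifier and parameter names (for example, result_type) are assumptions based on the project’s documentation at the time of writing and may change between releases.

```python
# A minimal sketch of PydanticAI's structured output handling (illustrative only).
# The model identifier and the result_type parameter are assumptions based on the
# project's docs and may differ between releases.
from pydantic import BaseModel
from pydantic_ai import Agent

class CityFacts(BaseModel):
    name: str
    country: str
    population: int

agent = Agent("openai:gpt-4o", result_type=CityFacts)

result = agent.run_sync("Give me basic facts about Barcelona.")
print(result.data)  # a validated CityFacts instance
```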
Information retrieval
Retrieval was a key topic during the conversations for this volume, so much so that we made it a theme: R in RAG. We’ve written recently about a number of intriguing techniques for RAG, and a couple featured in this volume: GraphRAG [Techniques/Trial] and Fast GraphRAG [Languages and Frameworks/Assess], both of which leverage knowledge graphs to improve the process of retrieval.
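To show why a knowledge graph helps retrieval, here’s a conceptual sketch (not the GraphRAG or Fast GraphRAG API): rather than only returning chunks that are semantically close to the query, retrieval expands from the entities mentioned in the query to their graph neighbours, pulling in related context that a pure vector search might miss. The extract_entities and chunks_for helpers are hypothetical.

```python
# A conceptual sketch of graph-expanded retrieval (not the GraphRAG/Fast GraphRAG API).
# extract_entities() and chunks_for() are hypothetical helpers standing in for an
# entity linker and a chunk store keyed by entity.
import networkx as nx

def graph_expanded_retrieval(query: str, graph: nx.Graph, hops: int = 1) -> list[str]:
    related = set(extract_entities(query))  # hypothetical: entities mentioned in the query
    for _ in range(hops):
        for entity in list(related):
            if entity in graph:
                related.update(graph.neighbors(entity))  # walk out to connected entities
    # Return the source chunks attached to every entity we reached.
    return [chunk for entity in related for chunk in chunks_for(entity)]
```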
Other tools on this edition of the Radar that can support information retrieval include Graphiti [Platforms/Assess] — which builds dynamic, temporally-aware knowledge graphs — and VectorChord [Platforms/Assess], a PostgreSQL extension for high-performance vector similarity search.
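Because VectorChord stays compatible with pgvector-style types and operators, querying it looks like ordinary SQL. The sketch below shows a nearest-neighbour lookup from Python; the table name, column names and connection string are assumptions, and index creation and tuning are deliberately left to the VectorChord documentation.

```python
# A minimal sketch of a nearest-neighbour query against a VectorChord-indexed table.
# Table/column names and the connection string are assumptions; index creation and
# tuning are left to the VectorChord documentation.
import psycopg

query_embedding = [0.1, 0.2, 0.3]  # in practice, produced by an embedding model
vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

with psycopg.connect("dbname=radar") as conn:
    rows = conn.execute(
        # <-> is the pgvector-style L2 distance operator.
        "SELECT id, content FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
        (vector_literal,),
    ).fetchall()

for doc_id, content in rows:
    print(doc_id, content[:80])
```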
Synthetic data
Synthetic data has an important role to play in what we call AI-readiness — we wrote about it recently. Unsurprisingly, the synthetic data tooling market is growing fast. On this volume of the Radar we featured Tonic [Platforms/Assess] and Synthesized [Platforms/Assess].
The AI landscape is evolving rapidly
It might sound like a truism to say the AI landscape is evolving at a dizzying pace, but recent editions of the Technology Radar demonstrate that it’s undeniable.
But what’s also undeniable is the importance of tracking and monitoring these changes. While the market may be convinced that AI is inevitable, in reality it’s a complex and fragmented field that’s yet to mature and stabilize (if it ever does).
That’s why we see the Radar as a really valuable publication. It will always be a place where you can track changes in the industry and get the perspective of people who actually use technology — so keep an eye on future volumes to find out what’s next in AI.
Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.