Volume 32 | April 2025

Techniques


Adopt

Trial

  • 5. API request collections as an API product artifact

    Treating an API as a product means prioritizing developer experience: it requires not only sound, standardized design of the API itself, but also comprehensive documentation and a smooth onboarding experience. While OpenAPI (Swagger) specifications can effectively document API interfaces, onboarding remains a challenge. Developers need quick access to working examples, with preconfigured authentication and realistic test data. As API client tools such as Postman, Bruno and Insomnia mature, we recommend treating API request collections as an API product artifact. Request collections should be thoughtfully designed to guide developers through key workflows, helping them understand the API's domain language and functionality with ease. To keep collections up to date, we recommend storing them in a repository and integrating them into the API's release pipeline.
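
    One possible way to wire a collection into the release pipeline is a small pre-publish check. The sketch below assumes a Postman-style collection checked into the repo at a hypothetical path and verifies that every request carries auth and an example body before the collection ships with the API; the field names follow the common collection JSON layout and may need adjusting for your client tool.

      # Sanity-check an API request collection before it ships with the API.
      # Hypothetical path and field layout; adjust to the collection tool you use.
      import json
      import sys
      from pathlib import Path

      COLLECTION = Path("api/collections/orders.postman_collection.json")  # hypothetical path

      def requests_in(items):
          """Yield (name, request) for every request in a possibly nested folder structure."""
          for item in items:
              if "item" in item:                      # folder: recurse into it
                  yield from requests_in(item["item"])
              elif "request" in item:
                  yield item.get("name", "unnamed"), item["request"]

      def main() -> int:
          collection = json.loads(COLLECTION.read_text())
          problems = []
          for name, request in requests_in(collection.get("item", [])):
              if not request.get("auth") and not collection.get("auth"):
                  problems.append(f"{name}: no auth configured")
              if request.get("method") in ("POST", "PUT") and not request.get("body"):
                  problems.append(f"{name}: missing example body")
          for problem in problems:
              print(problem)
          return 1 if problems else 0

      if __name__ == "__main__":
          sys.exit(main())

    Running a check like this as a pipeline step keeps the collection honest as the API evolves, much as contract tests keep documentation honest.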

  • 6. Architecture advice process

    A persistent challenge in large software teams is deciding who makes the architectural decisions that shape how systems evolve. The State of DevOps report shows that the traditional approach of Architecture Review Boards tends to be counterproductive: it impedes flow and is correlated with low organizational performance. A compelling alternative is the architecture advice process — a decentralized approach in which anyone can make an architectural decision, provided they seek advice from those affected and from those with relevant expertise. This method lets teams optimize for flow without compromising architectural quality, at both small and large scales. At first glance the approach may seem controversial, but practices such as architecture decision records (ADRs) and advice forums ensure that decisions remain well-informed, while empowering the people closest to the work to make them. We're seeing this model succeed in a growing number of organizations, including those in highly regulated industries.

  • 7. GraphRAG

    In our last update on retrieval-augmented generation (RAG), we introduced GraphRAG. It was originally described in Microsoft's article as a two-step process: (1) chunk documents and use LLM-based analysis of the chunks to build a knowledge graph; (2) at query time, retrieve relevant chunks via embeddings and follow the edges of the knowledge graph to discover additional related chunks, which are then added to the augmented prompt. In many cases this approach improves the quality of LLM-generated responses. We've observed similar benefits when using generative AI to understand legacy codebases, where structural information such as abstract syntax trees and code dependencies is used to build the knowledge graph. The GraphRAG pattern is gaining traction, and tools and frameworks such as Neo4j's GraphRAG Python package are emerging to support it. We also consider Graphiti to fit a broad definition of GraphRAG as a pattern.
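
    A minimal sketch of the query-time half of the pattern, assuming chunking and graph construction have already happened offline: the word-overlap score and the hand-built graph below stand in for a real embedding model and an LLM-extracted knowledge graph.

      # Query-time GraphRAG sketch: retrieve chunks by similarity, then expand
      # along knowledge-graph edges before assembling the augmented prompt.
      import re

      chunks = {
          "c1": "Orders are created by the checkout service.",
          "c2": "The checkout service publishes an OrderPlaced event.",
          "c3": "Invoices are generated when OrderPlaced is consumed.",
      }
      graph = {"c1": ["c2"], "c2": ["c3"], "c3": []}   # edges found by LLM analysis (hand-built here)

      def tokens(text: str) -> set[str]:
          return set(re.findall(r"[a-z]+", text.lower()))

      def score(query: str, text: str) -> float:
          # Stand-in for embedding similarity: fraction of query words present in the chunk.
          query_words = tokens(query)
          return len(query_words & tokens(text)) / len(query_words)

      def retrieve(query: str, top_k: int = 1, hops: int = 1) -> list[str]:
          seeds = sorted(chunks, key=lambda c: score(query, chunks[c]), reverse=True)[:top_k]
          selected, frontier = set(seeds), list(seeds)
          for _ in range(hops):                        # follow graph edges to related chunks
              frontier = [n for c in frontier for n in graph[c] if n not in selected]
              selected.update(frontier)
          return [chunks[c] for c in sorted(selected)]

      query = "Who creates orders?"
      context = "\n".join(retrieve(query, hops=2))
      print(f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}")

    Pure similarity search would return only the first chunk here; following the graph edges also pulls in the related event and invoicing chunks.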

  • 8. Just-in-time privileged access management

    The principle of least privilege ensures that users and systems have only the minimum permissions needed to perform their tasks. Privileged credential abuse is a leading cause of security breaches, with privilege escalation a common attack vector. Attackers often start with low-level access and exploit software vulnerabilities or misconfigurations to gain administrator privileges, especially when accounts carry excessive or unnecessary permissions. Another often-overlooked risk is standing privileges — privileged access that is continuously available — which significantly expands the attack surface. Just-in-time privileged access management mitigates this by granting access only when it's needed and revoking it as soon as the task is complete, minimizing exposure. A true least-privilege security model ensures that users, applications and systems hold only the permissions they require, for the shortest time necessary — a key requirement for compliance and regulatory security. Our teams have implemented this through an automated workflow that triggers a lightweight approval process, assigns users a temporary role with scoped access and enforces a time to live (TTL) on each role, so that privileges automatically expire once the task is done, further reducing the risk of privilege misuse.
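
    A highly simplified sketch of the grant-and-expiry mechanics described above; the role name and approval hook are hypothetical, and in practice the temporary role assignment would be delegated to your cloud provider's IAM or a PAM product.

      # Just-in-time access sketch: approve a request, assign a scoped temporary
      # role and enforce a TTL so the privilege expires on its own.
      from dataclasses import dataclass
      from datetime import datetime, timedelta, timezone
      from typing import Optional

      @dataclass
      class TemporaryGrant:
          user: str
          role: str                     # narrowly scoped role, e.g. "db-restore-operator" (hypothetical)
          expires_at: datetime

          def is_active(self) -> bool:
              return datetime.now(timezone.utc) < self.expires_at

      def request_access(user: str, role: str, ttl_minutes: int, approved: bool) -> Optional[TemporaryGrant]:
          if not approved:              # the lightweight approval workflow happens outside this sketch
              return None
          expiry = datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes)
          return TemporaryGrant(user, role, expiry)

      grant = request_access("alice", "db-restore-operator", ttl_minutes=30, approved=True)
      if grant and grant.is_active():
          print(f"{grant.user} holds {grant.role} until {grant.expires_at.isoformat()}")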

  • 9. Model distillation

    Scaling laws have been one of the key drivers of AI's rapid progress — larger models, larger datasets and more compute lead to more capable AI systems. However, consumer hardware and edge devices often lack the capacity to run large models, which creates the need for model distillation.

    Model distillation transfers knowledge from a larger, more powerful model (the teacher) to a smaller, more efficient model (the student). The process typically involves generating a sample dataset from the teacher model and fine-tuning the student on it to capture the teacher's statistical properties (a minimal sketch of the data-generation step appears after this blurb). Unlike pruning or quantization, which compress a model by removing parameters, distillation aims to preserve domain-specific knowledge while minimizing the loss of accuracy. It can also be combined with quantization for further optimization.

    The technique was first proposed by Geoffrey Hinton et al. and is now widely used. A notable example is the DeepSeek R1 distilled versions of Qwen and Llama, which retain strong reasoning capabilities in smaller models. As distillation matures, it is no longer confined to research labs and is being applied to everything from industrial to personal projects. Providers such as OpenAI and Amazon Bedrock offer detailed guides to help developers distill their own small language models (SLMs). We believe adopting model distillation can help organizations manage the cost of LLM deployment while unlocking the potential of on-device LLM inference.
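
    As a minimal sketch of the data-generation step mentioned above, assuming the teacher call below stands in for a real API or local inference call and the file name is illustrative:

      # Distillation data generation: have the teacher answer a set of prompts,
      # then store prompt/completion pairs as a fine-tuning dataset for the
      # smaller student model.
      import json
      from pathlib import Path

      def teacher_answer(prompt: str) -> str:
          # Stand-in for calling the large teacher model.
          return f"(teacher's detailed answer to: {prompt})"

      prompts = [
          "Summarize the order-to-invoice flow.",
          "Explain what an OrderPlaced event contains.",
      ]

      with Path("distillation_dataset.jsonl").open("w") as f:
          for prompt in prompts:
              record = {"prompt": prompt, "completion": teacher_answer(prompt)}
              f.write(json.dumps(record) + "\n")

    The resulting JSONL can then be fed to whatever fine-tuning tooling you use for the student model, such as a provider's fine-tuning API or an open-source training script.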

  • 10. Prompt engineering

    Prompt engineering refers to the process of designing and refining prompts for generative AI models to produce high-quality, context-aware responses. It typically involves crafting clear, specific and contextually relevant prompts, tailored to the task or application, to optimize the model's output. As LLM capabilities evolve — particularly with the emergence of reasoning models — prompt engineering practices must adapt as well. Based on our experience with AI code generation, few-shot prompting can perform worse than simple zero-shot prompting when working with reasoning models. In addition, the widely used chain-of-thought (CoT) prompting technique can also degrade reasoning model performance — probably because these models already have a fine-tuned CoT mechanism built in through reinforcement learning.

    Our hands-on experience is echoed by academic research suggesting that "advanced models may eliminate the need for prompt engineering in software engineering." In practice, however, traditional prompt engineering techniques remain an important way to reduce hallucinations and improve output quality, especially given the significant differences between reasoning models and general-purpose LLMs in response time and token cost. When building agentic applications, we recommend choosing models strategically based on actual needs and continuously iterating on prompt templates and the corresponding engineering methods. Striking the right balance between performance, response time and token cost remains key to getting the most out of LLMs.
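
    As a small illustration of the zero-shot versus few-shot distinction discussed above, the sketch below builds both kinds of prompt for the same task; the model call itself is omitted, and the task and examples are made up.

      # Zero-shot vs. few-shot prompt construction for the same classification task.
      TASK = "Classify the sentiment of the review as positive or negative."

      def zero_shot(review: str) -> list[dict]:
          # Often sufficient -- and sometimes better -- for reasoning models.
          return [{"role": "user", "content": f"{TASK}\n\nReview: {review}"}]

      def few_shot(review: str) -> list[dict]:
          # Worked examples help smaller, non-reasoning models follow the format.
          examples = (
              "Review: The update made everything slower. -> negative\n"
              "Review: Setup took two minutes, love it. -> positive\n"
          )
          return [{"role": "user", "content": f"{TASK}\n\n{examples}\nReview: {review} ->"}]

      print(zero_shot("Great docs, terrible error messages."))
      print(few_shot("Great docs, terrible error messages."))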

  • 11. Small language models (SLMs)

    The recent release of DeepSeek R1 is a great example of why small language models (SLMs) continue to be interesting. The full-size R1 has 671 billion parameters and requires roughly 1,342 GB of VRAM to run, something usually only achievable with a "mini cluster" of eight state-of-the-art NVIDIA GPUs. DeepSeek also offers "distilled" versions — smaller, open-weight models based on Qwen and Llama — that transfer its capabilities and can run on far more modest hardware. Although these smaller versions lose some performance, they still represent a huge leap over previous small language models. The SLM space keeps innovating: since the last Radar, Meta has introduced Llama 3.2 in 1B and 3B sizes, Microsoft has released Phi-4 with impressive quality from a 14B model, and Google has released PaliGemma 2, a vision-language model available in 3B, 10B and 28B sizes. These are just some of the smaller models released recently, and they clearly show a trend that remains worth watching.

  • 12. Using GenAI to understand legacy codebases

    Over the past few months, using GenAI to understand legacy codebases has made notable progress. Mainstream tools such as GitHub Copilot are being touted as able to help modernize legacy codebases, and tools like Sourcegraph's Cody are making it easier for developers to navigate and understand entire codebases. These tools combine several GenAI techniques to provide context-aware assistance, greatly simplifying the analysis of complex legacy systems. Specialized frameworks such as S3LLM further show how LLMs can handle large-scale scientific software — written in languages such as Fortran or Pascal — extending GenAI-enhanced code understanding to contexts beyond mainstream enterprise IT. Given the sheer volume of legacy code around the world, we believe this technique will continue to gain traction.

Assess

  • 13. AI-friendly code design

    Supervised software engineering agents are increasingly capable of identifying necessary updates and making larger changes to a codebase. At the same time, we're seeing growing complacency with AI-generated code, and developers becoming reluctant to review large AI-made change sets. A common justification for this is that human-oriented code quality matters less since AI can handle future modifications; however, AI coding assistants also perform better with well-factored codebases, making AI-friendly code design crucial for maintainability.

    Fortunately, good software design for humans also benefits AI. Expressive naming provides domain context and functionality; modularity and abstractions keep AI’s context manageable by limiting necessary changes; and the DRY (don’t repeat yourself) principle reduces duplicate code — making it easier for AI to keep the behavior consistent. So far, the best AI-friendly patterns align with established best practices. As AI evolves, expect more AI-specific patterns to emerge, so thinking about code design with this in mind will be extremely helpful.
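
    As a tiny, contrived illustration of the kind of design this points at — expressive, domain-oriented names and small, single-purpose functions keep the context a coding assistant (or a human) needs to a minimum:

      # Before: terse names and implicit intent; an assistant has to guess what this does.
      def proc(d, t):
          return [x for x in d if x["amount"] > t]

      # After: the name, parameters and docstring carry the domain context with the code.
      def filter_orders_above_amount(orders: list[dict], minimum_amount: float) -> list[dict]:
          """Return only the orders whose amount exceeds the given minimum."""
          return [order for order in orders if order["amount"] > minimum_amount]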

  • 14. AI-powered UI testing

    New techniques for AI-powered assistance on software teams are emerging beyond just code generation. One area gaining traction is AI-powered UI testing, leveraging LLMs' abilities to interpret graphical user interfaces. There are several approaches to this. One category of tools uses multi-modal LLMs fine-tuned for UI snapshot processing, allowing test scripts written in natural language to navigate an application. Examples in this space include QA.tech or LambdaTest's KaneAI. Another approach, seen in Browser Use, combines multi-modal foundation models with Playwright's insights into a web page's structure rather than relying on fine-tuned models.

    When integrating AI-powered UI tests into a test strategy, it’s crucial to consider where they provide the most value. These methods can complement manual exploratory testing, and while the non-determinism of LLMs may introduce flakiness, their fuzziness can be an advantage. This could be useful for testing legacy applications with missing selectors or applications that frequently change labels and click paths.
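
    A minimal sketch of the second approach — combining a browser automation tool's view of the page with a model that maps a natural-language step to a concrete action. The LLM call is stubbed out, and the URL, step and selector are placeholders; this is not the actual API of any of the tools named above.

      # Natural-language UI step -> selector -> Playwright action.
      from playwright.sync_api import sync_playwright

      def choose_selector(step: str, page_html: str) -> str:
          # Stand-in for an LLM call that reads the page structure and the
          # natural-language step, and returns a selector to act on.
          return "text=More information"        # placeholder selector for the demo page

      with sync_playwright() as p:
          browser = p.chromium.launch()
          page = browser.new_page()
          page.goto("https://example.com")      # placeholder application under test
          selector = choose_selector("Follow the 'More information' link", page.content())
          page.click(selector)
          print(page.url)
          browser.close()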

  • 15. Competence envelope as a model for understanding system failures

    The theory of graceful extensibility defines the basic rules governing adaptive systems, including the socio-technical systems involved in building and operating software. A key concept in this theory is the competence envelope — the boundary within which a system can function robustly in the face of failure. When a system is pushed beyond its competence envelope, it becomes brittle and is more likely to fail. This model provides a valuable lens for understanding system failure, as seen in the complex failures that led to the 2024 Canva outage. Residuality theory, a recent development in software architecture thinking, offers a way to test a system's competence envelope by deliberately introducing stressors and analyzing how the system has adapted to historical stressors over time. These approaches align with concepts of anti-fragility, resilience and robustness in socio-technical systems, and we're eager to see practical applications of these ideas emerge in the field.

  • 16. Structured output from LLMs

    Structured output from LLMs refers to the practice of constraining a language model's response into a defined schema. This can be achieved either by instructing a generalized model to respond in a particular format or by fine-tuning a model so it "natively" outputs, for example, JSON. OpenAI now supports structured output, allowing developers to supply a JSON Schema, pydantic or Zod object to constrain model responses. This capability is particularly valuable for enabling function calling, API interactions and external integrations, where accuracy and adherence to a format are critical. Structured output not only enhances the way LLMs can interface with code but also supports broader use cases like generating markup for rendering charts. Additionally, structured output has been shown to reduce the chance of hallucinations in model outputs.
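
    A short sketch of the Pydantic flavour of this, assuming a recent OpenAI Python SDK with the parse helper; the model name and schema fields are illustrative.

      # Constrain the model's reply to a typed schema instead of free text.
      from pydantic import BaseModel
      from openai import OpenAI

      class Invoice(BaseModel):
          invoice_number: str
          total_amount: float
          currency: str

      client = OpenAI()
      completion = client.beta.chat.completions.parse(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": "Extract the invoice details: INV-42, total 99.90 EUR."}],
          response_format=Invoice,
      )
      print(completion.choices[0].message.parsed)   # an Invoice instance, not free text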

Hold

  • 17. AI-accelerated shadow IT

    AI is lowering the barriers for noncoders to build and integrate software themselves, instead of waiting for the IT department to get around to their requirements. While we’re excited about the potential this unlocks, we’re also wary of the first signs of AI-accelerated shadow IT. No-code workflow automation platforms now support AI API integration (e.g., OpenAI or Anthropic), making it tempting to use AI as duct tape — stitching together integrations that previously weren’t possible, such as turning chat messages in one system into ERP API calls via AI. At the same time, AI coding assistants are becoming more agentic, enabling noncoders with basic training to build internal utility applications.

    This has all the hallmarks of the next evolution of the spreadsheets that still power critical processes in some enterprises — but with a much bigger footprint. Left unchecked, this new shadow IT could lead to a proliferation of ungoverned, potentially insecure applications, scattering data across more and more systems. Organizations should be aware of these risks and carefully weigh the trade-offs between rapid problem-solving and long-term stability.

  • 18. Complacency with AI-generated code

    As AI coding assistants continue to gain traction, so does the growing body of data and research highlighting concerns about complacency with AI-generated code. GitClear's latest code quality research shows that in 2024, duplicate code and code churn have increased even more than predicted, while refactoring activity in commit histories has declined. Also reflecting AI complacency, Microsoft research on knowledge workers found that AI-driven confidence often comes at the expense of critical thinking — a pattern we've observed as complacency sets in with prolonged use of coding assistants. The rise of supervised software engineering agents further amplifies the risks, because when AI generates larger and larger change sets, developers face greater challenges in reviewing results. The emergence of "vibe coding" — where developers let AI generate code with minimal review — illustrates the growing trust in AI-generated outputs. While this approach can be appropriate for things like prototypes or other types of throw-away code, we strongly caution against using it for production code.

  • 19. Local coding assistants

    Organizations remain wary of third-party AI coding assistants, particularly due to concerns about code confidentiality. As a result, many developers are considering using local coding assistants — AI that runs entirely on their machines — eliminating the need to send code to external servers. However, local assistants still lag behind their cloud-based counterparts, which rely on larger, more capable models. Even on high-end developer machines, smaller models remain limited in their capabilities. We've found that they struggle with complex prompts, lack the necessary context window for larger problems and often cannot trigger tool integrations or function calls. These capabilities are especially important for agentic workflows, which are at the cutting edge of coding assistance right now.

    So while we recommend proceeding with low expectations, there are some capabilities that are useful locally. Some popular IDEs do now embed smaller models into their core features, such as Xcode's predictive code completion and JetBrains' full-line code completion. And locally runnable LLMs like Qwen Coder are a step forward for local inline suggestions and handling simple coding queries. You can test these capabilities with Continue, which supports the integration of local models via runtimes like Ollama.
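
    For simple coding queries against a locally running model, a sketch along these lines is enough to get started; it assumes the ollama Python package and a Qwen coder model already pulled locally, and the model tag is illustrative.

      # Ask a locally served model a simple coding question via Ollama.
      import ollama

      response = ollama.chat(
          model="qwen2.5-coder",
          messages=[{"role": "user", "content": "Write a Python one-liner that deduplicates a list while keeping order."}],
      )
      print(response["message"]["content"])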

  • 20. Replacing pair programming with AI

    When people talk about coding assistants, the topic of pair programming inevitably comes up. Our profession has a love-hate relationship with it: some swear by it, others can't stand it. Coding assistants now raise the question: can a human pair with the AI instead of another human and get the same results for the team? GitHub Copilot even calls itself "your AI pair programmer." While we do think a coding assistant can bring some of the benefits of pair programming, we advise against fully replacing pair programming with AI. Framing coding assistants as pair programmers ignores one of the key benefits of pairing: to make the team, not just the individual contributors, better. Coding assistants can offer benefits for getting unstuck, learning about a new technology, onboarding or making tactical work faster so that we can focus on the strategic design. But they don't help with any of the team collaboration benefits, like keeping the work-in-progress low, reducing handoffs and relearning, making continuous integration possible or improving collective code ownership.

  • 21. Reverse ETL

    We're seeing a worrying proliferation of so-called Reverse ETL. Regular ETL jobs have their place in traditional data architectures, where they transfer data from transaction processing systems to a centralized analytics system, such as a data warehouse or data lake. While this architecture has well-documented shortcomings, many of which are addressed by a data mesh, it remains common in enterprises. In such an architecture, moving data back from a central analytics system to a transaction system makes sense in certain cases — for example, when the central system can aggregate data from multiple sources or as part of a transitional architecture when migrating toward a data mesh. However, we're seeing a growing trend where product vendors use Reverse ETL as an excuse to move increasing amounts of business logic into a centralized platform — their product. This approach exacerbates many of the issues caused by centralized data architectures, and we suggest exercising extreme caution when introducing data flows from a sprawling, central data platform to transaction processing systems.

  • 22. SAFe™

    We see continued adoption of SAFe™ (Scaled Agile Framework®). We also continue to observe that SAFe's over-standardized, phase-gated processes create friction, that it can promote silos and that its top-down control generates waste in the value stream and discourages engineering talent creativity, while limiting autonomy and experimentation in teams. A key reason for adoption is the complexity of making an organization agile, with enterprises hoping that a framework like SAFe offers a simple, process-based shortcut to becoming agile. Given the widespread adoption of SAFe — including among our clients — we've trained over 100 Thoughtworks consultants to better support them. Despite this in-depth knowledge and no lack of trying, we come away thinking that sometimes there just is no simple solution to a complex problem, and we keep recommending leaner, value-driven approaches and governance that work in conjunction with a comprehensive change program.

    Scaled Agile Framework® and SAFe™ are trademarks of Scaled Agile, Inc.

Can't find what you're looking for?

 

Each edition of the Radar reflects what we've learned over the past six months, so the item you're looking for may have appeared in a previous edition. Because there is always more we want to cover, we sometimes have to drop items that haven't changed in a long time. The Radar is based on our own experience rather than a comprehensive market analysis, so you may not find the technology you care about most.
