Enable javascript in your browser for better experience. Need to know to enable it? Go here.
第31期 | 十月2024

语言 & 框架

  • 语言 & 框架

    采纳 试验 评估 暂缓 采纳 试验 评估 暂缓
  • 新的
  • 移进/移出
  • 没有变化

语言 & 框架

采纳 ?

  • 75. dbt

    我们仍然认为 dbt是在 ELT 数据管道中实施数据转换的强大且合理的选择。我们喜欢它能够引入工程上的严谨性,并支持模块化、可测试性和 SQL 转换的可重用性等实践。dbt 集成了很多云数据仓库、数据湖仓和数据库,包括SnowflakeBigQuery、Redshift、Databricks 和 Postgres,并且拥有健康的社区包生态系统。最近在 dbt core 1.8+ 和全新的 dbt Cloud 无版本体验中引入的原生单元测试支持,进一步巩固了其在我们工具箱中的地位。我们的团队很欣赏这个新的单元测试功能,它使他们能够轻松定义静态测试数据,设置输出预期,并测试管道的增量和完全刷新模式。在许多情况下,这使得团队能够淘汰自制脚本,同时保持相同的质量水平。

  • 76. Testcontainers

    在我们的经验中,Testcontainers 是创建可靠测试环境的一个有效默认选项。它是一个移植到多种语言的库,可以将常见的测试依赖项进行 Docker 化——包括各种类型的数据库、队列技术、云服务以及像网页浏览器这样的 UI 测试依赖项,并能够在需要时运行自定义 Dockerfile。最近发布了一个 桌面版本,允许对测试会话进行可视化管理,并能够处理更复杂的场景,这对我们的团队非常有用。

试验 ?

  • 77. CAP

    CAP 是一个实现了 Outbox 模式 的 .NET 库。 在使用 RabbitMQ 或 Kafka 等分布式消息系统时,我们经常面临确保数据库更新和事件发布按原子性执行的挑战。CAP 通过在引发事件的同一个数据库事务中记录事件发布的意图,解决了这一问题。我们发现 CAP 非常实用,它支持多种数据库和消息平台,并确保至少一次的消息投递。

  • 78. CARLA

    CARLA 是一个开源的自动驾驶模拟器,它可用于在生产部署之前测试自动驾驶系统。它在创建和重用车辆、地形、人类、动物等 3D 模型上具有很强的灵活性,这使得它可以用来模拟各种场景,比如模拟行人走上马路或遇到迎面而来的特定速度的车辆。那些待测试的自动驾驶系统必须能够识别这些动态参与者并采取适当的行动,例如刹车。我们的团队使用 CARLA 进行自动驾驶系统的持续开发和测试。

  • 79. Databricks Asset Bundles

    Databricks Asset Bundles (DABs),于 2024 年 4 月 实现基本可用,正逐渐成为打包和部署 Databricks 资产的首选工具,帮助我们的数据团队应用软件工程实践。DABs 支持将工作流和任务的配置以及要在这些任务中执行的代码打包成一个 bundle,并通过 CI/CD 管道部署到多个环境中。它提供了常见资产类型的模板,并支持自定义模板,这使得能够为数据工程和机器学习项目创建定制服务模版。我们的团队越来越多地将其作为工程工作流中的关键部分。尽管 DABs 包含了针对 notebooks 的模板,并支持将 notebooks 部署到生产环境中,我们并不推荐 生产化 notebooks ,而是鼓励有意地编写符合生产标准的代码,以支持此类工作负载的可维护性、可靠性和可扩展性需求。

  • 80. Instructor

    当我们作为终端用户使用大语言模型(LLM)聊天机器人时,它们通常会返回给我们一个非结构化的自然语言答案。在构建超越聊天机器人的生成 AI 应用时,请求 LLM 以 JSON、YAML 或其他格式返回结构化答案并在应用中解析和使用该响应可能非常有用。然而,由于 LLM 是非确定性的,它们可能并不总是按照我们要求的方式执行。Instructor 是一个可以帮助我们请求 从 LLMs 获取结构化输出 的库。您可以定义预期的输出结构,并在 LLM 未返回您要求的结构时配置重试。由于与 LLM 进行交互时,最佳的终端用户体验往往是将结果流式传输给他们,而不是等待完整响应,Instructor 还可以处理从流中解析部分结构的任务。

  • 81. Kedro

    Kedro 作为 MLOps 工具有了显著的改善,并始终保持对模块化和工程实践的关注,这一点是我们从一开始就非常喜欢的。突出其模块化的一步是推出了独立的 kedro-datasets 包,该包将代码与数据解耦。Kedro 在其命令行接口、起始项目模板和遥测功能方面进行了增强。 此外,最近发布的 VS Code 扩展对开发者体验也是一个很好的提升。

  • 82. LiteLLM

    LiteLLM 是一个用于无缝集成各种大语言模型(LLM)提供商 API 的库,通过OpenAI API 格式交互。它支持多种 提供方和模型 ,并为文本生成、嵌入和图像生成提供统一的接口。LiteLLM 简化了集成过程,通过匹配每个提供商的特定端点要求来翻译输入。它还提供了实现生产应用中所需的操作功能的框架,如缓存、日志记录、速率限制和负载均衡,从而确保不同 LLM 的一致操作。我们的团队使用 LiteLLM 来轻松切换各种模型,这在模型快速演变的今天尤为必要。然而,值得注意的是,不同模型在相同提示下的响应会有所不同,这表明仅一致的调用方式可能不足以优化生成性能。此外,每个模型对附加功能的实现方式也各不相同,单一接口可能无法满足所有需求。例如,我们的一个团队在通过 LiteLLM 代理 AWS Bedrock 模型时,难以充分利用其函数调用功能。

  • 83. LlamaIndex

    LLamaIndex 包含能够设计特定领域、上下文增强的 LLM 应用程序的引擎,并支持数据摄取、向量索引和文档上的自然语言问答等任务。我们的团队使用 LlamaIndex 构建了一个 检索增强生成 (RAG) 流水线,自动化文档摄取,索引文档嵌入,并根据用户输入查询这些嵌入。使用 LlamaHub,您可以扩展或自定义 LlamaIndex 模块以满足您的需求,例如构建使用您首选的 LLM、嵌入和向量存储提供者的 LLM 应用程序。

  • 84. LLM Guardrails

    LLM Guardrails 是一套用于防止大语言模型(LLMs)生成有害、使人误解或不相关内容的指南、政策或过滤器。Guardrails 也可用于保护 LLM 应用免受恶意用户通过操纵输入等技术对其滥用。它们通过为模型设定边界来作为安全网,确保内容的处理和生成在可控范围内。在这一领域中,诸如 NeMo GuardrailsGuardrails AIAporia Guardrails 等框架已经逐渐崭露头角,并被我们的团队认为非常有用。我们建议每个 LLM 应用都应设置相应的安全护栏,并且不断改进其规则和政策。这对于构建负责任和值得信赖的 LLM 聊天应用至关重要。

  • 85. Medusa

    根据我们的经验,大多数用于构建购物网站的电子商务解决方案通常会陷入 80/20 陷阱,即我们可以轻松构建出 80% 的需求,但对于剩下的 20% 却无能为力。Medusa 提供了一个良好的平衡。它是一个高度可定制的开源商业平台,允许开发人员创建独特且量身定制的购物体验,可以自我托管或运行在 Medusa 的平台上。Medusa 基于 Next.js 和 PostgreSQL 构建,通过提供从基本购物车和订单管理到高级功能(如礼品卡模块和不同地区的税收计算)的全面模块,加快了开发过程。我们发现 Medusa 是一个有价值的框架,并在几个项目中应用了它。

  • 86. Pkl

    Pkl 是一种开源的配置语言及工具,最初由苹果公司内部使用而创建。它的主要特点是其类型和验证系统,能够在部署之前捕捉配置错误。Pkl 帮助我们的团队减少了代码重复(例如环境覆盖的情况),并且能够在配置更改应用到生产环境之前进行验证。它可以生成 JSON、PLIST、YAML 和 .properties 文件,并且具有包括代码生成在内的广泛集成开发环境(IDE)和语言支持,

  • 87. ROS 2

    ROS 2是一个为开发机器人系统设计的开源框架。它提供了一套用于模块化实现应用程序的库和工具,涵盖了诸如进程间通信、多线程执行和服务质量等功能。ROS 2 在其前任的基础上进行了改进,提供了更好的实时性能、更高的模块化、对多种平台的支持以及合理的默认设置。ROS 2 在汽车行业正在获得越来越多的关注;它基于节点的架构和基于主题的通信模型,特别适合那些具有复杂且不断发展的车载应用(如自动驾驶功能)的制造商。

  • 88. seL4

    在软件定义汽车 (SDV) 或其他安全关键场景中,操作系统的实时稳定性至关重要。由于该领域的高准入门槛,少数公司垄断了这一领域,因此像 seL4这样的开源解决方案显得尤为珍贵。seL4 是一个高保障、高性能的操作系统微内核。它使用 形式化验证 方法来“数学上”确保操作系统的行为符合规范。其微内核架构还将核心职责最小化,以确保系统的稳定性。我们已经看到像蔚来汽车 (NIO) 这样的电动汽车公司参与 seL4 生态系统,未来在这一领域可能会有更多的发展。

  • 89. SetFit

    当前大多数基于 AI 的工具都是生成式的——它们生成文本和图像,使用生成式预训练模型(GPTs)来完成这些任务。而对于需要处理现有文本的用例——例如文本分类或意图识别——sentence transformers 是首选工具。在这个领域,SetFit是一个用于微调 sentence transformers 的框架。我们喜欢 SetFit 的原因是它使用对比学习来区分不同的意图类别,通常只需非常少量的样本(甚至少于 25 个)就能实现清晰的分类。sentence transformers 在生成式 AI 系统中也可以发挥作用。我们成功地在一个使用大语言模型(LLM)的客户聊天机器人系统中使用 SetFit 进行意图检测。尽管我们知晓 OpenAI 的内容审核 API,我们仍选择基于 SetFit 的分类器进行额外的微调,以实现更严格的过滤。

  • 90. vLLM

    vLLM 是一个高吞吐量、内存高效的 LLM 推理引擎,既可以在云环境中运行,也可以在本地部署。它无缝支持多种 模型架构 和流行的开源模型。我们的团队在 NVIDIA DGX 和 Intel HPC 等 GPU 平台上部署了容器化的 vLLM 工作节点,托管模型如 Llama 3.1(8B and 70B)Mistral 7BLlama-SQL ,用于开发者编码辅助、知识搜索和自然语言数据库交互。vLLM 兼容 OpenAI SDK 标准,促进了一致的模型服务。Azure 的 AI 模型目录 使用自定义推理容器来提升模型服务性能,vLLM 由于其高吞吐量和高效的内存管理,成为默认的推理引擎。vLLM 框架正在成为大规模模型部署的默认选择。

评估 ?

  • 91. Apache XTable™

    Among the available open table formats that enable lakehouses — such as Apache Iceberg, Delta and Hudi — no clear winner has emerged. Instead, we’re seeing tooling to enable interoperability between these formats. Delta UniForm, for example, enables single-directional interoperability by allowing Hudi and Iceberg clients to read Delta tables. Another new entrant to this space is Apache XTable™, an Apache incubator project that facilitates omnidirectional interoperability across Hudi, Delta and Iceberg. Like UniForm, it converts metadata among these formats without creating a copy of the underlying data. XTable could be useful for teams experimenting with multiple table formats. However, for long-term use, given the difference in the features of these formats, relying heavily on omnidirectional interoperability could result in teams only being able to use the "least common denominator" of features.

  • 92. dbldatagen

    Preparing test data for data engineering is a significant challenge. Transferring data from production to test environments can be risky, so teams often rely on fake or synthetic data instead. In this Radar, we explored novel approaches like synthetic data for testing and training models. But most of the time, lower-cost procedural generation is enough. dbldatagen (Databricks Labs Data Generator) is such a tool; it’s a Python library for generating synthetic data within the Databricks environment for testing, benchmarking, demoing and many other uses. dbldatagen can generate synthetic data at scale, up to billions of rows within minutes, supporting various scenarios such as multiple tables, change data capture and merge/join operations. It can handle Spark SQL primitive types well, generate ranges and discrete values and apply specified distributions. When creating synthetic data using the Databricks ecosystem, dbldatagen is an option worth evaluating.

  • 93. DeepEval

    DeepEval is an open-source python-based evaluation framework, for evaluating LLM performance. You can use it to evaluate retrieval-augmented generation (RAG) and other kinds of apps built with popular frameworks like LlamaIndex or LangChain, as well as to baseline and benchmark when you're comparing different models for your needs. DeepEval provides a comprehensive suite of metrics and features to assess LLM performance, including hallucination detection, answer relevancy and hyperparameter optimization. It offers integration with pytest and, along with its assertions, you can easily integrate the test suite in a continuous integration (CI) pipeline. If you're working with LLMs, consider trying DeepEval to improve your testing process and ensure the reliability of your applications.

  • 94. DSPy

    Most language model-based applications today rely on prompt templates hand-tuned for specific tasks. DSPy, a framework for developing such applications, takes a different approach that does away with direct prompt engineering. Instead, it introduces higher-level abstractions oriented around program flow (through modules that can be layered on top of each other), metrics to optimize towards and data to train or test with. It then optimizes the prompts or weights of the underlying language model based on those defined metrics. The resulting codebase looks much like the training of neural networks with PyTorch. We find the approach it takes refreshing for its different take and think it's worth experimenting with.

  • 95. Flutter for Web

    Flutter is known for its cross-platform support for iOS and Android applications. Now, it has expanded to more platforms. We've evaluated Flutter for Web previously — it allows us to build apps for iOS, Android and the browser from the same codebase. Not every web application makes sense in Flutter, but we think Flutter is particularly suited for cases like progressive web apps, single-page apps and converting existing Flutter mobile apps to the web. Flutter had already supported WebAssembly (WASM) as a compilation target in its experimental channel, which means it was under active development with potential bugs and performance issues. The most recent releases have made it stable. The performance of a Flutter web application compiled to its WASM target is far superior to its JavaScript compilation target. The near-native performance on different platforms is also why many developers initially choose Flutter.

  • 96. kotaemon

    kotaemon is an open-source RAG-based tool and framework for building Q&A apps for knowledge base documents. It can understand multiple document types, including PDF and DOC formats, and provides a web UI, based on Gradio, which allows users to organize and interact with a knowledge base via a chat interface. It has built-in RAG pipelines with a vector store and can be extended with SDKs. kotaemon also cites the source documents in its responses, along with web-based inline previews and a relevance score. For anyone wanting to do a RAG-based document Q&A application, this customizable framework is a very good starting point.

  • 97. Lenis

    Lenis is a lightweight and powerful smooth scrolling library designed for modern browsers. It enables smooth scrolling experiences, such as WebGL scroll syncing and parallax effects, which makes it ideal for teams building pages with fluid, seamless scroll interactions. Our developers found Lenis simple to use, offering a streamlined approach for creating smooth scrolls. However, the library can have issues with accessibility, particularly with vertical and horizontal scrolling interactions, which could confuse users with disabilities. While visually appealing, it needs careful implementation to maintain accessibility.

  • 98. LLMLingua

    LLMLingua enhances LLM efficiency by compressing prompts using a small language model to remove nonessential tokens with minimal performance loss. This approach allows LLMs to maintain reasoning and in-context learning while efficiently processing longer prompts, which addresses challenges like cost efficiency, inference latency and context handling. Compatible with various LLMs without additional training and supporting frameworks like LLamaIndex, LLMLingua is great for optimizing LLM inference performance.

  • 99. Microsoft Autogen

    Microsoft Autogen is an open-source framework that simplifies the creation and orchestration of AI agents, enabling multi-agent collaboration to solve complex tasks. It supports both autonomous and human-in-the-loop workflows, while offering compatibility with a range of LLMs and tools for agent interaction. One of our teams used Autogen for a client to build an AI-powered platform where each agent represented a specific skill, such as code generation, code review or summarizing documentation. The framework enabled the team to create new agents seamlessly and consistently by defining the right model and workflow. They leveraged LlamaIndex to orchestrate workflows, allowing agents to manage tasks like product search and code suggestions efficiently. While Autogen has shown promise, particularly in production environments, concerns about scalability and managing complexity as more agents are added remain. Further assessment is needed to evaluate its long-term viability in scaling agent-based systems.

  • 100. Pingora

    Pingora is a Rust framework to build fast, reliable and programmable network services. Originally developed by Cloudflare to address Nginx's shortcomings, Pingora is already showing great potential, as newer proxies like River are being built on its foundation. While most of us don't face Cloudflare's level of scale, we do encounter scenarios where flexible application-layer routing is essential for our network services. Pingora's architecture allows us to leverage the full power of Rust in these situations without sacrificing security or performance.

  • 101. Ragas

    Ragas is a framework designed to evaluate the performance of retrieval-augmented generation (RAG) pipelines, addressing the challenge of assessing both retrieval and generation components in these systems. It provides structured metrics such as faithfulness, answer relevance and context utilization which help evaluate the effectiveness of RAG-based systems. Our developers found it useful for running periodic evaluations to fine-tune parameters like top-k retrievals and embedding models. Some teams have integrated Ragas into pipelines that run daily, whenever the prompt template or the model changes. While its metrics offer solid insights, we’re concerned that the framework may not capture all the nuances and intricate interactions of complex RAG pipelines, and we recommend considering additional evaluation frameworks. Nevertheless, Ragas stands out for its ability to streamline RAG assessment in production environments, offering valuable data-driven improvements.

  • 102. Score

    Many organizations that implement their own internal development platforms tend to create their own platform orchestration systems to enforce organizational standards among developers and their platform hosting teams. However, the basic features of a paved-road deployment platform for hosting container workloads in a safe, consistent and compliant manner are similar from one organization to another. Wouldn't it be nice if we had a shared language for specifying those requirements? Score is showing some promise of becoming a standard in this space. It’s a declarative language in the form of YAML that describes how a containerized workload should be deployed and which specific services and parameters it will need to run. Score was originally developed by Humanitec as the configuration language for their Platform Orchestrator product, but it is now under custodianship of the Cloud Native Computing Foundation (CNCF) as an open-source project. With the backing of the CNCF, Score has the potential to be more widely used beyond the Humanitec product. It has been released with two reference implementations: Kubernetes and Docker Compose. The extensibility of Score will hopefully lead to community contributions for other platforms. Score certainly bears a resemblance to the Open Application Model (OAM) specification for Kubevela, but it's more focused on the deployment of container workloads than the entire application. There is also some overlap with SST, but SSI is more concerned with deployment directly into a cloud infrastructure rather than onto an internal engineering platform. We're watching Score with interest as it evolves.

  • 103. shadcn

    shadcn challenges the traditional concept of component libraries by offering reusable, copy-and-paste components that become part of your codebase. This approach gives teams full ownership and control, enabling easier customization and extension — areas where more popular conventional libraries like MUI and Chakra UI often fall short. Built with Radix UI and Tailwind CSS, shadcn integrates seamlessly into any React-based application, which makes it a good fit for projects prioritizing control and extensibility. It includes a CLI to help in the process of copying and pasting the components into your project. Its benefits also include reducing hidden dependencies and avoiding tightly coupled implementations, which is why shadcn is gaining traction as a compelling alternative for teams seeking a more hands-on, adaptable approach to front-end development.

  • 104. Slint

    Slint is a declarative GUI framework for building native user interfaces for Rust, C++ or JavaScript applications. Although it’s a multiplatform UI framework with important features such as live preview, responsive UI design, VS Code integration and a native user experience, we particularly want to highlight its usefulness for embedded systems. Teams developing embedded applications have traditionally faced a limited number of options for UI development, each with its own trade-offs. Slint offers the perfect balance between developer experience and performance, using an easy-to-use, HTML-like markup language and compiling directly to machine code. At run time, it also boasts a low-resources footprint, which is critical for embedded systems. In short, we like Slint because it brings proven practices from web and mobile development to the embedded ecosystem.

  • 105. SST

    SST is a framework for deploying applications into a cloud environment along with provisioning all the services that the application needs to run. SST is not just an IaC tool; it's a framework with a TypeScript API that enables you to define your application environment, a service that deploys your application when triggered on a Git push as well as a GUI console to manage the resulting application and invoke the SST management features. Although SST was originally based on AWS Cloud Formation and CDK, its latest version has been implemented on top of Terraform and Pulumi so that, in theory, it's cloud agnostic. SST has native support for deploying several standard web application frameworks, including Next.js and Remix, but also supports headless API applications. SST appears to be in a category of its own. While it bears some resemblance to platform orchestration tools like Kubevela, it also provides developer conveniences like a live mode that proxies AWS Lambda invocations back to a function running on the developer's local machine. Right now, SST remains a bit of a curiosity, but it is a project and part of a category of tools worth watching as it evolves.

暂缓 ?

无异常点

无法找到需要的信息?

 

每期技术雷达中的条目都在试图反映我们在过去六个月中的技术洞见,或许你所搜索的内容已经在前几期中出现过。由于我们有太多想要谈论的内容,有时候不得不剔除一些长期没有发生变化的条目。技术雷达来自于我们的主观经验,而非全面的市场分析,所以你可能会找不到自己最在意的技术条目。

下载 PDF

 

English | Español | Português | 中文

订阅技术雷达简报

 

立即订阅

查看存档并阅读往期内容