Platforms
Adopt
-
23. GitLab CI/CD
GitLab CI/CD has evolved into a fully integrated system within GitLab, covering everything from code integration and testing through to deployment and monitoring. It supports complex workflows with features such as multi-stage pipelines, caching, parallel execution and auto-scaling runners, making it suitable for large-scale projects and complex pipeline needs. We want to highlight its built-in security and compliance tools (such as SAST and DAST analysis), which make it a good fit for use cases with strict compliance requirements. It also integrates seamlessly with Kubernetes, supporting cloud-native workflows, and offers real-time logging, test reporting and traceability for better observability.
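As a rough illustration of the features mentioned above, a minimal `.gitlab-ci.yml` with multiple stages, caching, parallel test jobs and GitLab's bundled SAST template might look like this (image tags, script names and the branch rule are placeholder assumptions, not a recommended setup):

```yaml
include:
  - template: Security/SAST.gitlab-ci.yml   # GitLab's built-in SAST scanning

stages: [build, test, deploy]

build:
  stage: build
  image: node:20
  cache:
    key:
      files:
        - package-lock.json                 # cache keyed on the lockfile
    paths:
      - node_modules/
  script:
    - npm ci
    - npm run build

test:
  stage: test
  image: node:20
  parallel: 4                               # split the suite across four runners
  script:
    - npm test

deploy:
  stage: deploy
  environment: production
  script:
    - ./deploy.sh                           # placeholder deployment script
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
```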
-
24. Trino
Trino is an open-source, distributed SQL query engine designed for interactive analytical queries over big data. It's optimized to run both on-premise and in the cloud, and it lets you query data where it lives, including relational databases and various proprietary data stores via connectors. Trino can also query data stored in Parquet and in open table formats such as Apache Iceberg. Its built-in query federation capabilities allow data from multiple sources to be queried as a single logical table, which makes it a great choice for analytical workloads that need to aggregate data across sources. Trino is a key part of popular stacks such as AWS Athena, Starburst and other proprietary data platforms. Our teams have used it successfully in several use cases, and when it comes to querying data sets across multiple sources for analytics, Trino has been a reliable choice.
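Federation in practice is plain SQL: one Trino statement can join tables that live in different catalogs. The sketch below only builds such a query as a string; the `postgresql` and `hive` catalog names and all table names are hypothetical examples, not a real deployment.

```python
# A single Trino query joining a relational catalog with a data-lake catalog.
# Catalog, schema and table names are hypothetical examples.
federated_query = """
SELECT o.order_id, o.total, c.segment
FROM postgresql.public.orders AS o
JOIN hive.analytics.customer_segments AS c
  ON o.customer_id = c.customer_id
WHERE o.order_date >= DATE '2024-01-01'
"""

# Each table reference is catalog.schema.table, so one statement spans sources.
referenced_catalogs = {line.split(".")[0].split()[-1]
                      for line in federated_query.splitlines()
                      if line.startswith(("FROM", "JOIN"))}
print(sorted(referenced_catalogs))  # ['hive', 'postgresql']
```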
Trial
-
25. ABsmartly
ABsmartly is an advanced A/B testing platform designed for fast, trustworthy decision-making. Its standout feature is the Group Sequential Testing (GST) engine, which accelerates test results by up to 80% compared to traditional A/B testing tools. The platform offers real-time reporting, deep data segmentation and seamless full-stack integration through an API-first approach, enabling experiments across web, mobile apps, microservices and machine-learning (ML) models.
ABsmartly addresses key challenges in scalable, data-driven experimentation by enabling faster iteration and more agile product development. Its zero-latency execution, deep segmentation capabilities and support for multi-platform experiments make it a valuable tool for organizations looking to scale their experimentation culture and prioritize data-driven innovation. By significantly shortening test cycles and automating results analysis, ABsmartly helped us optimize features and user experiences more efficiently than traditional A/B testing platforms.
-
26. Dapr
Dapr has evolved significantly since we last featured it in the Radar. Its many new features include job scheduling, virtual actors, more sophisticated retry policies and observability components. Its component catalog continues to expand with new capabilities such as job management, cryptography and much more. Our teams also highlight its growing focus on secure defaults, with support for mTLS and distroless images. Overall, we've been pleased with Dapr and look forward to its continued evolution.
-
27. Grafana Alloy
Previously known as Grafana Agent, Grafana Alloy is an open-source OpenTelemetry Collector. Alloy is designed to be an all-in-one collector for all telemetry data, including logs, metrics and traces. It supports collecting commonly used telemetry data formats such as OpenTelemetry, Prometheus and Datadog. With the recent deprecation of Promtail, Alloy is emerging as the go-to choice for telemetry data collection, especially for logs, when using the Grafana observability stack.
-
28. Grafana Loki
Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. Loki only indexes metadata about your logs as a set of labels for each log stream. The log data itself is stored in a block storage solution such as S3, GCS or Azure Blob Storage. The promised result is reduced operational complexity and lower storage costs compared to its competitors. As you'd expect, it integrates tightly with Grafana and Grafana Alloy, although other collection mechanisms can be used.
Loki 3.0 introduced native OpenTelemetry support, making ingestion and integration with OpenTelemetry systems as simple as configuring an endpoint. It also offers advanced multi-tenancy features, such as tenant isolation via shuffle-sharding, which prevents misbehaving tenants (e.g., heavy queries or outages) from affecting others in a cluster. If you haven't been following developments in the Grafana ecosystem, now is a great time to catch up, as it's evolving rapidly.
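Loki's label-based indexing shows up directly in its push API: each stream carries a small set of labels, and the log lines themselves are just timestamped values that are never indexed. A minimal offline sketch of that payload shape (the labels, app name and log line are illustrative; the request itself is deliberately omitted):

```python
import json
import time

# Loki's /loki/api/v1/push endpoint expects streams keyed by a small label set;
# only these labels are indexed -- the log lines themselves are not.
payload = {
    "streams": [
        {
            "stream": {"app": "checkout", "env": "prod"},  # indexed labels
            "values": [
                # [timestamp in nanoseconds (as a string), log line]
                [str(time.time_ns()), "order 42 created"],
            ],
        }
    ]
}
body = json.dumps(payload)
# POST this body to http://<loki>/loki/api/v1/push with
# Content-Type: application/json (omitted here to keep the sketch offline).
```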
-
29. Grafana Tempo
Grafana Tempo is a high-scale distributed tracing backend, compatible with open standards such as OpenTelemetry and designed to be cost-efficient. It relies on object storage for long-term trace retention, enabling trace search, span-based metrics generation and correlation with logs and metrics. By default, Tempo uses a columnar block format based on Apache Parquet, which improves query performance and allows other tools to access the trace data. Queries are executed via TraceQL and the Tempo CLI. Grafana Alloy can also be configured to collect and forward traces to Tempo. Our teams self-hosted Tempo on GKE, using MinIO for object storage, OpenTelemetry collectors and Grafana for trace visualization.
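TraceQL selects spans by resource attributes and intrinsics such as duration. A sketch of the kind of query involved (the service name and latency threshold are illustrative, not from a real system):

```python
# A TraceQL query: select slow spans from one service.
# Service name and threshold are illustrative examples.
traceql = '{ resource.service.name = "checkout" && duration > 500ms }'
# This can be run from Grafana's Explore view or via the Tempo CLI / query API.
```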
-
30. Railway
Heroku used to be a great choice for many developers who wanted to launch and deploy their applications quickly. In recent years, we've also seen the rise of deployment platforms such as Vercel, which are more modern, lightweight and easy to use but designed for front-end applications. A full-stack alternative in this space is Railway, a PaaS cloud platform that streamlines everything from GitHub/Docker deployment to production monitoring.
Railway supports most mainstream programming frameworks, databases and containerized deployments. When choosing a platform as the long-term host for an application, it's important to compare costs across platforms in detail. So far, our team has had a good experience with deployment and observability on Railway. It operates smoothly and integrates well with the continuous deployment practices we advocate.
-
31. Unblocked
Unblocked is an off-the-shelf AI team assistant. Once connected to code repositories, corporate documentation platforms, project management tools and communication tools, Unblocked helps answer complex questions about technical and business concepts, architectural design and implementation, as well as operational processes. This is particularly useful for navigating large or legacy systems. While using Unblocked, we've observed that teams value quick access to contextual information over code or user-story generation; for those scenarios, software engineering assistants, especially coding assistants, are better suited.
-
32. Weights & Biases
Weights & Biases has continued to evolve, adding more LLM-focused features since its last appearance in the Radar. They're expanding Traces and introducing Weave, a full-fledged platform that goes beyond tracking LLM-based agentic systems. Weave lets you create system evaluations, define custom metrics, use LLMs as judges for tasks such as summarization and save data sets that capture different behaviors for analysis. This helps optimize LLM components and track performance both locally and globally. The platform also enables iterative development and effective debugging of agentic systems, where failures can be hard to detect. It additionally facilitates collecting valuable human feedback, which can later be used to fine-tune the models.
Assess
-
33. Arize Phoenix
With the popularity of LLM and agentic applications, LLM observability is becoming more and more important. Previously, we’ve recommended platforms such as Langfuse and Weights & Biases (W&B). Arize Phoenix is another emerging platform in this space, and our team has had a positive experience using it. It offers standard features like LLM tracing, evaluation and prompt management, with seamless integration into leading LLM providers and frameworks. This makes it easy to gather insights on LLM output, latency and token usage with minimal configuration. So far, our experience is limited to the open-source tool but the broader Arize platform offers more comprehensive capabilities. We look forward to exploring it in the future.
-
34. Chainloop
Chainloop is an open-source supply chain security platform that helps security teams enforce compliance while allowing development teams to seamlessly integrate security compliance into CI/CD pipelines. It consists of a control plane, which acts as the single source of truth for security policies, and a CLI, which runs attestations within CI/CD workflows to ensure compliance. Security teams define workflow contracts specifying which artifacts — such as SBOMs and vulnerability reports — must be collected, where to store them and how to evaluate compliance. Chainloop uses Rego, OPA's policy language, to validate attestations — for example, ensuring a CycloneDX SBOM meets version requirements. During workflow execution, security artifacts like SBOMs are attached to an attestation and pushed to the control plane for enforcement and auditing. This approach ensures compliance can be enforced consistently and at scale while minimizing friction in development workflows. This results in an SLSA level-three–compliant single source of truth for metadata, artifacts and attestations.
-
35. DeepSeek R1
DeepSeek-R1 is DeepSeek's first generation of reasoning models. Through a progression of non-reasoning models, the engineers at DeepSeek designed and used methods to maximize hardware utilization. These include Multi-Head Latent Attention (MLA), Mixture of Experts (MoE) gating, 8-bit floating-point (FP8) training and low-level PTX programming. Their high-performance computing co-design approach enables DeepSeek-R1 to rival state-of-the-art models at significantly reduced cost for training and inference.
DeepSeek-R1-Zero is notable for another innovation: the engineers were able to elicit reasoning capabilities from a non-reasoning model using simple reinforcement learning without any supervised fine-tuning. All DeepSeek models are open-weight, which means they are freely available, though training code and data remain proprietary. The repository includes six dense models distilled from DeepSeek-R1, based on Llama and Qwen, with DeepSeek-R1-Distill-Qwen-32B outperforming OpenAI-o1-mini on various benchmarks.
-
36. Deno
Created by Ryan Dahl, the inventor of Node.js, Deno was designed to address what he saw as mistakes in Node.js. It features a stricter sandboxing system, built-in dependency management and native TypeScript support — a key draw for its user base. Many of us prefer Deno for TypeScript projects, as it feels like a true TypeScript runtime and toolchain, rather than an add-on to Node.js.
Since its inclusion in the Radar in 2019, Deno has made significant advancements. The Deno 2 release introduces backward compatibility with Node.js and npm libraries, long-term support (LTS) releases and other improvements. Previously, one of the biggest barriers to adoption was the need to rewrite Node.js applications. These updates reduce migration friction while expanding dependency options for supporting tools and systems. Given the massive Node.js and npm ecosystem, these changes should drive further adoption.
Additionally, Deno’s Standard Library has stabilized, helping combat the proliferation of low-value npm packages across the ecosystem. Its tooling and Standard Library make TypeScript or JavaScript more appealing for server-side development. However, we caution against choosing a platform solely to avoid polyglot programming.
-
37. Graphiti
Graphiti builds dynamic, temporally-aware knowledge graphs that capture evolving facts and relationships. Our teams use GraphRAG to uncover data relationships, which enhances retrieval and response accuracy. As data sets constantly evolve, Graphiti maintains temporal metadata on graph edges to record relationship lifecycles. It ingests both structured and unstructured data as discrete episodes and supports queries using a fusion of time-based, full-text, semantic and graph algorithms. For LLM-based applications — whether RAG or agentic — Graphiti enables long-term recall and state-based reasoning.
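The core idea of temporal metadata on edges can be pictured as a validity interval per relationship: rather than overwriting a fact, the old edge is closed and a new one opened, preserving the lifecycle. A generic sketch of that pattern (this is an illustration of the concept, not Graphiti's actual API; all names are invented):

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class TemporalEdge:
    """A relationship with a validity interval, in the spirit of
    temporally-aware knowledge graphs (not Graphiti's actual API)."""
    source: str
    relation: str
    target: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means currently valid

def update_fact(edges, source, relation, new_target, now):
    """Close any currently valid edge for this fact, then open a new one."""
    for e in edges:
        if e.source == source and e.relation == relation and e.valid_to is None:
            e.valid_to = now  # record when the old relationship stopped holding
    edges.append(TemporalEdge(source, relation, new_target, valid_from=now))

edges = []
t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 6, 1, tzinfo=timezone.utc)
update_fact(edges, "alice", "works_at", "AcmeCorp", t1)
update_fact(edges, "alice", "works_at", "Initech", t2)
# History is retained: the old edge is closed, not deleted.
```

Time-scoped queries ("where did alice work in March 2024?") then become simple interval checks over the edges, which is what enables state-based reasoning over evolving data.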
-
38. Helicone
Similar to Langfuse, Weights & Biases and Arize Phoenix, Helicone is a managed LLMOps platform designed to meet the growing enterprise demand for LLM cost management, ROI evaluation and risk mitigation. Open-source and developer-focused, Helicone supports production-ready AI applications, offering prompt experimentation, monitoring, debugging and optimization across the entire LLM lifecycle. It enables real-time analysis of costs, utilization, performance and agentic stack traces across various LLM providers. While it simplifies LLM operations management, the platform is still emerging and may require some expertise to fully leverage its advanced features. Our team has had a good experience using it so far.
-
39. Humanloop
Humanloop is an emerging platform focused on making AI systems more reliable, adaptable and aligned with user needs by integrating human feedback at key decision points. It offers tools for human labeling, active learning and human-in-the-loop fine-tuning as well as LLM evaluation against business requirements. Additionally, it helps manage the cost-effective lifecycle of GenAI solutions with greater control and efficiency. Humanloop supports collaboration through a shared workspace, version-controlled prompt management and CI/CD integration to prevent regressions. It also provides observability features such as tracing, logging, alerting and guardrails to monitor and optimize AI performance. These capabilities make it particularly relevant for organizations deploying AI in regulated or high-risk domains where human oversight is critical. With its focus on responsible AI practices, Humanloop is worth evaluating for teams looking to build scalable and ethical AI systems.
-
40. Model Context Protocol (MCP)
One of the biggest challenges in prompting is ensuring the AI tool has access to all the context relevant to the task. Often, this context already exists within the systems we use all day: wikis, issue trackers, databases or observability systems. Seamless integration between AI tools and these information sources can significantly improve the quality of AI-generated outputs.
The Model Context Protocol (MCP), an open standard released by Anthropic, provides a standardized framework for integrating LLM applications with external data sources and tools. It defines MCP servers and clients, where servers access the data sources and clients integrate and use this data to enhance prompts. Many coding assistants have already implemented MCP integration, allowing them to act as MCP clients. MCP servers can be run in two ways: Locally, as a Python or Node process running on the user’s machine, or remotely, as a server that the MCP client connects to via SSE (though we haven't seen any usage of the remote server variant yet). Currently, MCP is primarily used in the first way, with developers cloning open-source MCP server implementations. While locally run servers offer a neat way to avoid third-party dependencies, they remain less accessible to nontechnical users and introduce challenges such as governance and update management. That said, it's easy to imagine how this standard could evolve into a more mature and user-friendly ecosystem in the future.
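At the wire level, MCP is JSON-RPC 2.0: a client opens a session with an `initialize` request declaring its protocol version and capabilities, and the server replies with what it offers (tools, resources, prompts). A hand-rolled sketch of that first message (field values such as the version string and client name are illustrative; real clients use an SDK rather than building messages by hand):

```python
import json

# An MCP client's opening handshake is a JSON-RPC 2.0 "initialize" request.
# The protocol version and client name below are illustrative.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}
wire_message = json.dumps(initialize_request)
# A locally run MCP server typically reads messages like this from stdin
# and writes its responses to stdout; remote servers use SSE transport.
```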
-
41. Open WebUI
Open WebUI is an open-source, self-hosted AI platform with a versatile feature set. It supports OpenAI-compatible APIs and integrates with providers like OpenRouter and GroqCloud, among others. It can run entirely offline by connecting to local or self-hosted models via Ollama. Open WebUI includes a built-in capability for RAG, allowing users to interact with local and web-based documents in a chat-driven experience. It offers granular RBAC controls, enabling different models and platform capabilities for different user groups. The platform is extensible through Functions — Python-based building blocks that customize and enhance its capabilities. Another key feature is model evaluation, which includes a model arena for side-by-side comparisons of LLMs on specific tasks. Open WebUI can be deployed at various scales — as a personal AI assistant, a team collaboration assistant or an enterprise-grade AI platform.
-
42. pg_mooncake
pg_mooncake is a PostgreSQL extension that adds columnar storage and vectorized execution. Columnstore tables are stored as Iceberg or Delta Lake tables in the local file system or S3-compatible cloud storage. pg_mooncake supports loading data from file formats like Parquet, CSV and even Hugging Face datasets. It can be a good fit for heavy data analytics that typically requires columnar storage, as it removes the need to add dedicated columnar store technologies into your stack.
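Creating a columnstore table is meant to look like ordinary PostgreSQL DDL. The sketch below holds the kind of statements involved as strings; the `USING columnstore` clause follows pg_mooncake's documented syntax as we understand it, while the table, columns and S3 path are invented examples, and exact COPY options may vary by extension version.

```python
# DDL/DML sketch for pg_mooncake; names and paths are invented examples.
create_columnstore = """
CREATE TABLE analytics_events (
    event_time timestamptz,
    user_id bigint,
    payload jsonb
) USING columnstore;
"""

# Loading from Parquet; exact COPY options depend on the extension version.
load_parquet = "COPY analytics_events FROM 's3://example-bucket/events.parquet';"
```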
-
43. Reasoning models
One of the most significant AI advances since the last Radar is the breakthrough and proliferation of reasoning models. Also marketed as "thinking models," these models have achieved top human-level performance in benchmarks like frontier mathematics and coding.
Reasoning models are usually trained through reinforcement learning or supervised fine-tuning, enhancing capabilities such as step-by-step thinking (CoT), exploring alternatives (ToT) and self-correction. Examples include OpenAI’s o1/o3, DeepSeek R1 and Gemini 2.0 Flash Thinking. However, these models should be seen as a distinct category of LLMs rather than simply more advanced versions.
This increased capability comes at a cost. Reasoning models require longer response time and higher token consumption, leading us to jokingly call them "Slower AI" (as if current AI wasn’t slow enough). Not all tasks justify this trade-off. For simpler tasks like text summarization, content generation or fast-response chatbots, general-purpose LLMs remain the better choice. We advise using reasoning models in STEM fields, complex problem-solving and decision-making — for example, when using LLMs as judges or improving explainability through explicit CoT outputs. At the time of writing, Claude 3.7 Sonnet, a hybrid reasoning model, had just been released, hinting at a possible fusion between traditional LLMs and reasoning models.
-
44. Restate
Restate is a durable execution platform, similar to Temporal, developed by the original creators of Apache Flink. Feature-wise it offers workflows as code, stateful event processing, the saga pattern and durable state machines. Written in Rust and deployed as a single binary, it uses a distributed log to record events, implemented using a virtual consensus algorithm based on Flexible Paxos; this ensures durability in the event of node failure. SDKs are available for the usual suspects: Java, Go, Rust and TypeScript. We still maintain that it's best to avoid distributed transactions in distributed systems, because of both the additional complexity and the inevitable additional operational overhead involved. However, this platform is worth assessing if you can’t avoid distributed transactions in your environment.
-
45. Supabase
Supabase is an open-source Firebase alternative for building scalable and secure backends. It offers a suite of integrated services, including a PostgreSQL database, authentication, instant APIs, Edge Functions, real-time subscriptions, storage and vector embeddings. Supabase aims to streamline back-end development, allowing developers to focus on building front-end experiences while leveraging the power and flexibility of open-source technologies. Unlike Firebase, Supabase is built on top of PostgreSQL. If you're working on prototyping or an MVP, Supabase is worth considering, as it will be easier to migrate to another SQL solution after the prototyping stage.
-
46. Synthesized
A common challenge in software development is generating test data for development and test environments. Ideally, test data should be as production-like as possible, while ensuring no personally identifiable or sensitive information is exposed. Though this may seem straightforward, test data generation is far from simple. That's why we’re interested in Synthesized — a platform that can mask and subset existing production data or generate statistically relevant synthetic data. It integrates directly into build pipelines and offers privacy masking, providing per-attribute anonymization through irreversible data obfuscation techniques such as hashing, randomization and binning. Synthesized can also generate large volumes of synthetic data for performance testing. While it includes the obligatory GenAI features, its core functionality addresses a real and persistent challenge for development teams, making it worth exploring.
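The per-attribute obfuscation techniques named above (hashing, randomization and binning) are easy to picture in isolation. A generic sketch of each, purely to illustrate the techniques; this is not Synthesized's API, and all field names and domains are invented:

```python
import hashlib
import random

def mask_hash(value: str, salt: str = "fixed-salt") -> str:
    """Irreversible hashing: equal inputs map to the same token,
    but the original value cannot be recovered."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

def mask_bin(age: int, width: int = 10) -> str:
    """Binning: replace an exact value with the range it falls into."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def mask_randomize(rng: random.Random, domain: list) -> str:
    """Randomization: substitute a random but plausible value."""
    return rng.choice(domain)

record = {"email": "jane@example.com", "age": 37, "city": "Lisbon"}
rng = random.Random(42)  # seeded only to keep this sketch reproducible
masked = {
    "email": mask_hash(record["email"]),
    "age": mask_bin(record["age"]),  # 37 falls into the "30-39" bin
    "city": mask_randomize(rng, ["Porto", "Madrid", "Berlin"]),
}
```

Referential integrity is the hard part a platform adds on top: the same salt (or mapping) must be applied consistently across every table that shares the attribute, which is why hand-rolled scripts like this rarely scale.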
-
47. Tonic.ai
Tonic.ai is part of a growing trend in platforms designed to generate realistic, de-identified synthetic data for development, testing and QA environments. Similar to Synthesized, Tonic.ai is a platform with a comprehensive suite of tools addressing various data synthesis needs in contrast to the library-focused approach of Synthetic Data Vault. Tonic.ai generates both structured and unstructured data, maintaining the statistical properties of production data while ensuring privacy and compliance through differential privacy techniques. Key features include automatic detection, classification and redaction of sensitive information in unstructured data, along with on-demand database provisioning via Tonic Ephemeral. It also offers Tonic Textual, a secure data lakehouse that helps AI developers leverage unstructured data for retrieval-augmented generation (RAG) systems and LLM fine-tuning. Teams looking to accelerate engineering velocity while generating scalable, realistic data — all while adhering to stringent data privacy requirements — should consider evaluating Tonic.ai.
-
48. turbopuffer
turbopuffer is a serverless, multi-tenant search engine that seamlessly integrates vector and full-text search on object storage. We quite like its architecture and design choices, particularly its focus on durability, scalability and cost efficiency. By using object storage as a write-ahead log while keeping its query nodes stateless, it’s well-suited for high-scale search workloads.
Designed for performance and accuracy, turbopuffer delivers high recall out of the box, even for complex filter-based queries. It caches cold query results on NVMe SSDs and keeps frequently accessed namespaces in memory, enabling low-latency search across billions of documents. This makes it ideal for large-scale document retrieval, vector search and retrieval-augmented generation (RAG) AI applications. However, its reliance on object storage introduces trade-offs in query latency, making it most effective for workloads that benefit from stateless, distributed compute. turbopuffer powers high-scale production systems like Cursor but is currently only available by referral or invitation.
-
49. VectorChord
VectorChord is a PostgreSQL extension for vector similarity search, developed by the creators of pgvecto.rs as its successor. It's open source, compatible with pgvector data types and designed for disk-efficient, high-performance vector search. It employs inverted file indexing (IVF) along with RaBitQ quantization to enable fast, scalable and accurate vector search while significantly reducing computation demands. Like other PostgreSQL extensions in this space, it leverages the PostgreSQL ecosystem, allowing vector search alongside standard transactional operations. Though still in its early stages, VectorChord is worth assessing for vector search workloads.
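Because VectorChord is compatible with pgvector's data types, queries look like standard pgvector SQL; only the index definition changes. A sketch of the statements involved (the `vchordrq` access method name is taken from the VectorChord docs as we understand them and should be treated as an assumption; table and column names are invented):

```python
# Index creation is the VectorChord-specific part (access method name
# per its docs; treat as an assumption). Names are invented examples.
create_index = """
CREATE INDEX ON items USING vchordrq (embedding vector_l2_ops);
"""

# Nearest-neighbor search uses pgvector's distance operator as usual.
knn_query = """
SELECT id FROM items
ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
LIMIT 5;
"""
```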
Hold
-
50. Tyk hybrid API management
We've observed multiple teams encountering issues with the Tyk hybrid API management solution. While the concept of a managed control plane and self-managed data planes offers flexibility for complex infrastructure setups (such as multi-cloud and hybrid cloud), teams have experienced control plane incidents that were only discovered internally rather than by Tyk, highlighting potential observability gaps in Tyk's AWS-hosted environment. Furthermore, incident support appears slow; communicating via tickets and emails isn't ideal in these situations. Teams have also reported issues with the maturity of Tyk's documentation, often finding it inadequate for complex scenarios and issues. Additionally, other products in the Tyk ecosystem seem immature as well; for example, the enterprise developer portal is reported to be not backward compatible and to have limited customization capabilities. Especially for Tyk's hybrid setup, we recommend proceeding with caution and will continue to monitor its maturity.
Didn't find something you expected to see?
Each edition of the Radar features blips reflecting what we came across during the previous six months. We might have already covered what you're looking for in a previous Radar. Sometimes we cull things simply because there are too many to talk about. A blip might also be missing because the Radar reflects our experience; it isn't based on a comprehensive market analysis.
