By David Tan, Keith Schulze and Mitchell Lisle
In a previous chapter, we took a deep dive into the engineering practices that will help you deliver your data products quickly, safely and sustainably. Now we’ll explore delivery planning principles to guide how we shape and sequence our work. These enable teams to add incremental value, de-risk large projects, and find opportunities for continuous improvement when delivering data-intensive products.
Note: We won’t touch on typical iteration planning and release planning activities, such as story estimation and tracking cycle time, as they are increasingly common in the industry. Those practices remain important; we want to complement them by sharing additional principles and practices that bring an intentional focus on creating value.
Principle 1: Vertical thin slicing
One common pitfall in data engineering is the bottom-up sequential delivery of functional layers of a technical solution (think data lake, data warehouse, machine learning pipelines, and user-facing applications) – or horizontal slicing.
The downside of this approach is that users can only provide valuable feedback after significant investments of time and effort. It also leads to late integration issues when horizontal slices eventually come together, increasing the risk of release delays.
So, how can we plan and sequence our work to release value early – and often? How can we avoid the quicksand of data engineering for the sake of data engineering, and create a cadence of demonstrable business value in every iteration? The answer lies in flipping our thinking and slicing work vertically.
Thin vertical slices help to ensure that:
- At the level of stories, you articulate and demonstrate business value in all stories, and ensure that the majority of stories are independently shippable units of value.
- At the level of iterations, you regularly demonstrate value to users by delivering a collection of vertically sliced stories within a reasonable timeframe.
- At the level of releases, you plan, sequence and prioritize a collection of stories that’s oriented towards creating demonstrable business value.
Figure 1: Delivering early with thin vertical slices
Tracing a thin slice of value through the components of a data ecosystem and delivering value in vertically sliced user stories enables you to test and learn more cost-effectively, solve customer problems more efficiently, and deliver value sooner. We will describe what this looks like in action in the case study below.
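As a concrete illustration, here is a minimal sketch (in Python, with entirely hypothetical table, column and function names) of what one thin vertical slice could look like: a single source table ingested, one transformation applied, and one consumable output served end to end.

```python
import pandas as pd

# One thin vertical slice: ingest a single source table, apply one
# transformation, and serve one consumable metric end to end.
# All names here are illustrative placeholders.

def ingest_transactions() -> pd.DataFrame:
    # In a real slice this might read one table from the source system;
    # here we stub it with inline records so the example is runnable.
    return pd.DataFrame([
        {"customer_id": "c1", "amount": 120.0, "month": "2023-01"},
        {"customer_id": "c1", "amount": 80.0,  "month": "2023-02"},
        {"customer_id": "c2", "amount": 45.0,  "month": "2023-01"},
    ])

def transform_monthly_spend(transactions: pd.DataFrame) -> pd.DataFrame:
    # The single business-facing output of this slice: monthly spend
    # per customer, ready to demo to users at the end of the iteration.
    return (transactions
            .groupby(["customer_id", "month"], as_index=False)["amount"]
            .sum()
            .rename(columns={"amount": "monthly_spend"}))

def serve(monthly_spend: pd.DataFrame) -> None:
    # Serving could be an API, a dashboard or a published dataset;
    # printing stands in for that here.
    print(monthly_spend.to_string(index=False))

if __name__ == "__main__":
    serve(transform_monthly_spend(ingest_transactions()))
```

Each subsequent slice would widen the pipeline – another table, another metric – while keeping the end-to-end path demonstrable at every step.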
Principle 2: Data-driven hypothesis development
When embarking on data products, we often find ourselves solving problems with little data (that’s why we’re investing in data engineering!) and high levels of risk (the “known unknowns” and “unknown unknowns”, see Figure 2). In this scenario, we should focus on finding the shortest path to the right solutions, and eliminating ‘incorrect’ solutions as soon as possible.
Figure 2: Four problem categories
Data-driven hypothesis development (DDHD) is an effective way to approach these problems. Hypothesis testing is a powerful tool that can help to de-risk a large piece of work, and should be used not just before, but also during, delivery.
In essence, DDHD is about formulating hypotheses, running small experiments with clear outcomes and criteria, and using the data you collect to tell stories and share lessons with your team, business and stakeholders. The case study below will illustrate this in action, but for now, this is what a hypothesis looks like (with a sketch of how you might record one in code after the template):
- We believe that <this capability>
- Will result in <this outcome>
- We will know we have succeeded when <we see a measurable signal>.
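One lightweight way to keep hypotheses honest is to record them as structured data alongside the experiments that test them. The sketch below is a suggestion only – the `Hypothesis` fields and the example values are our own placeholders, not a standard:

```python
from dataclasses import dataclass

# Illustrative only: recording a hypothesis as structured data so it
# can live next to the experiment that tests it.

@dataclass(frozen=True)
class Hypothesis:
    capability: str      # "We believe that <this capability>"
    outcome: str         # "Will result in <this outcome>"
    success_signal: str  # "We will know we have succeeded when <signal>"

# A made-up example for a hypothetical data product experiment:
onboarding_hypothesis = Hypothesis(
    capability="a pre-filled customer onboarding form",
    outcome="faster loan applications",
    success_signal="median time-to-apply drops below 10 minutes",
)
```

Writing hypotheses down in a structured form like this makes it harder for success criteria to drift once the experiment is underway.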
DDHD creates a space and a framework for teams to run short experiments, learn as they deliver value incrementally, and continuously apply those lessons to increase impact and reduce the risk of costly, unvalidated investments.
Principle 3: Measuring delivery metrics
Research from DORA (DevOps Research and Assessment) has found that high-performing technology organizations excel at four key metrics:
- delivery lead time
- deployment frequency
- mean time to restore service
- change fail percentage
The four key metrics provide insight into the flow and friction of value delivery, and they’re a great starting point for identifying what is working well and what needs to be improved. If your organization doesn’t yet have the platform tooling to measure the four key metrics automatically, you can get started by surveying teams regularly with DORA’s quick check tool. Although less precise, this approach gives you a first indication of where your organization stands and how it is trending.
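If you can export deployment records from your delivery pipeline, the four key metrics are straightforward to compute. The sketch below is illustrative only: the `Deployment` record shape and field names are assumptions, to be adapted to whatever your tooling actually captures.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

# Illustrative only: the Deployment record shape is an assumption.
# Adapt the fields to whatever your pipeline or incident tooling exports.

@dataclass
class Deployment:
    committed_at: datetime               # first commit in the change
    deployed_at: datetime                # when it reached production
    failed: bool                         # caused a degradation or rollback?
    restored_at: datetime | None = None  # when service was restored

def hours(delta) -> float:
    return delta.total_seconds() / 3600

def four_key_metrics(deployments: list[Deployment], window_days: int) -> dict:
    # Assumes a non-empty list covering the reporting window, and that
    # every failed deployment has a restored_at timestamp.
    failures = [d for d in deployments if d.failed]
    return {
        "lead_time_hours": mean(
            hours(d.deployed_at - d.committed_at) for d in deployments),
        "deployments_per_day": len(deployments) / window_days,
        "time_to_restore_hours": mean(
            hours(d.restored_at - d.deployed_at) for d in failures
        ) if failures else 0.0,
        "change_fail_percentage": 100 * len(failures) / len(deployments),
    }
```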
You should also measure outcomes-oriented metrics that are relevant to your organization, such as improvements in efficiency and customer satisfaction. Outcomes-oriented metrics help align teams towards activities that contribute to organizational goals rather than busywork. The diagram below gives an example of what outcomes-oriented metrics might look like in the insurance domain.
Figure 4: Measuring delivery metrics and outcomes-oriented metrics (for example, improvement in customer experience or efficiency) helps teams focus on impactful work rather than busywork
However, these metrics should be undergirded by a healthy organizational culture and a value-oriented mindset. Otherwise, we can fall into the trap of dysfunctional metrics. As Goodhart’s law states: “When a measure becomes a target, it ceases to be a good measure.” And keep the Hawthorne effect in mind: when your team knows they’re being measured, there might be a reflex to bend the rules and find loopholes to meet the targets.
Case study: Applying delivery planning principles to shorten feedback cycles and create the right product
A B2B company, let’s call it Company X, wanted to help its customers run and grow their businesses with a new financial service offering. Company X knew that by using the historical transaction data it had from its customers, it could offer a better lending experience than other financial service providers. So we worked with Company X to create a customer credit history data product that enables it to recommend suitable financial products to its customers. Customers that consented to Company X using their data for this product would have a smoother experience getting appropriate financing for their business retail purchases.
Vertical slicing in action
Company X did not have a data platform for extracting, processing and creating new data products – the data we needed to build this was siloed in transactional datastores in production.
Rather than wait for a data platform to be completed (i.e. horizontal slicing), we applied lean product development and data mesh principles to build a data product that provides real customer value in the short term, while supporting extensibility towards a shared platform in the medium to long term. This allowed us to deliver a low-risk solution fast (in just over 4 months), while also providing valuable insight into an ongoing data platform development effort.
Figure 5: Build-measure-learn in context – a lean approach to product delivery
Phase 1: Discovery
With the help of domain experts, we refined Company X’s business requirements to outline data sources needed for the credit history product. We found we only needed seven (out of 150) tables from the source database to deliver the minimum requirements. This reduced data ingestion efforts, as we didn’t need to process or clean unnecessary data. Over 6 weeks, we also refined the features and cross-functional requirements of the customer credit history data product, and aligned on the intended business value.
We articulated hypotheses to help us find the shortest path to the ‘correct’ solutions and stay on track towards building the right product. For example, we could validate our approach by running an experiment and collecting data on one of our hypotheses (a sketch of the pre-screen it describes follows below):
- We believe that establishing an automated rule-based pre-screen based on various dimensions of a customer’s transaction history
- Will result in a scalable way of identifying creditworthy customers
- We will know we have succeeded when an automated pre-screen application is able to reject non-creditworthy customers within an X% margin of error relative to credit assessments done by professionally trained domain experts.
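To make this hypothesis concrete, here is a minimal, illustrative sketch of what a rule-based pre-screen over transaction history could look like. Every feature name and threshold below is a made-up placeholder; in practice, the rules would come from domain experts.

```python
import pandas as pd

# Illustrative rule-based pre-screen over a customer's transaction
# history. All feature names and thresholds are hypothetical.

def pre_screen(customer_history: pd.DataFrame) -> pd.Series:
    # Aggregate transactions into per-customer monthly totals, then
    # derive a few simple dimensions of the transaction history.
    monthly = (customer_history
               .groupby(["customer_id", "month"])["amount"].sum()
               .groupby("customer_id"))
    features = pd.DataFrame({
        "avg_monthly_revenue": monthly.mean(),
        "active_months": monthly.count(),
        "revenue_volatility": monthly.std().fillna(0),
    })
    # Hypothetical rules: pass customers with enough volume, enough
    # history, and revenue that is stable enough.
    return (
        (features["avg_monthly_revenue"] >= 5_000)
        & (features["active_months"] >= 12)
        & (features["revenue_volatility"] <= features["avg_monthly_revenue"])
    )
```

The experiment then compares these automated decisions against expert assessments of the same customers, measuring the margin of error named in the success criterion.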
Phase 2: Delivery
Once everyone was aligned on the product’s form and the value it should deliver, we started developing a minimum viable product (MVP). Scoping an MVP can be difficult. We aimed for the thinnest ‘vertical’ slice that provided feedback about the viability of the data product, while remaining close enough to the final product from a customer perspective to continually test our hypotheses. The MVP also uncovered potential edge cases, hidden or missed product opportunities, and possible obstacles. This early feedback helped us identify risks and decide where to focus our risk mitigation efforts when further developing the product.
It also helped to define the data sources and transformations that we could leverage when iterating on future releases of the product. Our focus for the production delivery phase was to implement well-governed transformations on supportable and extensible data infrastructure – and to serve the results to data product consumers. We applied our sensible default engineering practices, such as test-driven development (TDD), infrastructure as code, CI/CD, and observability in both code and data planes.
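As a small illustration of the TDD practice in this context, a test like the sketch below (the transformation and its column names are hypothetical) would be written before the transformation it exercises, pinning down the expected behaviour on a known input:

```python
import pandas as pd

def transform_monthly_spend(transactions: pd.DataFrame) -> pd.DataFrame:
    # Implementation written to make the test below pass
    # (TDD order: the test came first).
    return (transactions
            .groupby(["customer_id", "month"], as_index=False)["amount"]
            .sum()
            .rename(columns={"amount": "monthly_spend"}))

def test_monthly_spend_sums_per_customer_per_month():
    transactions = pd.DataFrame([
        {"customer_id": "c1", "amount": 10.0, "month": "2023-01"},
        {"customer_id": "c1", "amount": 15.0, "month": "2023-01"},
        {"customer_id": "c2", "amount": 7.0,  "month": "2023-01"},
    ])
    result = transform_monthly_spend(transactions)
    spend = {(row.customer_id, row.month): row.monthly_spend
             for row in result.itertuples()}
    assert spend[("c1", "2023-01")] == 25.0
    assert spend[("c2", "2023-01")] == 7.0
```

Tests like this double as executable documentation for data product consumers and guard the transformation against regressions as the pipeline evolves.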
Delivering an independent, comprehensive data product
Within 10 iterations (over 4 months), we delivered a consumable, fully automated and comprehensively governed data product with no dependency on a centralized data platform. The team also measured the four key metrics (e.g. delivery lead time, change fail percentage) and other delivery metrics (e.g. velocity, burn-up rate) to provide insight into how we were progressing towards the goal. The metrics helped us recalibrate delivery parameters where necessary.
Amplify impact, reduce time to delivery
These practices have helped Thoughtworks deliver value for clients time and again, and are sensible defaults that we bring to every data engagement to accelerate delivery and bring extraordinary impact. Wherever you are on your delivery journey right now, you can chart a path towards delivery success by:
- Building awareness: Are there any gaps or opportunities in your current delivery planning practices?
- Being open to what needs to change: How would you apply the principles and practices outlined in this chapter to help you improve your delivery planning?
- Executing the change: Connect industry-tested recommended practices with practical experience in successfully delivering data products.
In the next chapter, we’ll share how you can save hours by better managing the quality of your data.