Published : May 19, 2020
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar.
Understand more
May 2020
Assess
There are still some tool gaps when applying good software engineering practices in data engineering. Attempting to automate data quality checks between different steps in a data pipeline, one of our teams was surprised when they found only a few tools in this space. They settled on Deequ, a library for writing tests that resemble unit tests for data sets. Deequ is built on top of Apache Spark, and even though it's published by AWS Labs it can be used in environments other than AWS.