Enable javascript in your browser for better experience. Need to know to enable it? Go here.
Published : Apr 26, 2023
NOT ON THE CURRENT EDITION
This blip is not on the current edition of the Radar. If it was on one of the last few editions, it is likely that it is still relevant. If the blip is older, it might no longer be relevant and our assessment might be different today. Unfortunately, we simply don't have the bandwidth to continuously review blips from previous editions of the Radar. Understand more
Apr 2023
Assess ?

In previous Radars, we've featured data validation and testing platforms like Great Expectations that can be used to validate assumptions and test the quality of incoming data used for training or classification. Sometimes, though, all you need is a simple code library to implement tests and quality checks directly in pipelines. pandera is a Python library for testing and validating data across a wide range of frame types such as pandas, Dask or PySpark. pandera can implement simple assertions about fields or hypothesis tests based on statistical models. The wide range of supported frame libraries means tests can be written once and then applied to a variety of underlying data formats. pandera can also be used to generate synthetic data to test ML models.

Download the PDF

 

 

English | Español | Português | 中文

Sign up for the Technology Radar newsletter

 

Subscribe now

Visit our archive to read previous volumes