Metadata-Version: 2.1
Name: data-check
Version: 0.2.1
Summary: simple data validation
Home-page: https://andrjas.github.io/data_check/
License: MIT
Keywords: data,validation,testing,quality
Author: Andreas Rjasanow
Author-email: andrjas@gmail.com
Requires-Python: >=3.6.1,<4.0.0
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Other Audience
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Database
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Testing
Provides-Extra: mssql
Provides-Extra: mysql
Provides-Extra: oracle
Provides-Extra: postgres
Requires-Dist: SQLAlchemy (>=1.3.22,<1.4.0)
Requires-Dist: click (>=7.1.2,<7.2.0)
Requires-Dist: colorama (>=0.4.4,<0.5.0)
Requires-Dist: cx_Oracle (>=8.1.0,<8.2.0); extra == "oracle"
Requires-Dist: importlib-metadata (>=3.4.0,<4.0.0)
Requires-Dist: numpy (>=1.19.5,<1.20.0)
Requires-Dist: pandas (>=1.1.5,<1.2.0)
Requires-Dist: psycopg2-binary (>=2.8.6,<2.9.0); extra == "postgres"
Requires-Dist: pymysql[rsa]; extra == "mysql"
Requires-Dist: pyodbc (>=4.0.30,<4.1.0); extra == "mssql"
Requires-Dist: pyyaml (>=5.3.1,<6.0.0)
Project-URL: Repository, https://github.com/andrjas/data_check
Description-Content-Type: text/markdown

# data_check

data_check is a simple data validation tool. Write SQL queries and CSV files with the expected result sets, and data_check will run the queries and compare their results against the expected result sets.
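
To make the idea concrete, here is a minimal sketch of such a comparison using only the Python standard library and an in-memory SQLite database. It is not data_check's actual implementation: the function name `check_query` is hypothetical, and comparing values as strings while ignoring row order is an assumption made for this illustration.

```python
import csv
import io
import sqlite3


def check_query(connection, sql, expected_csv):
    """Return True if the query result matches the expected CSV content.

    Hypothetical helper for illustration only, not part of data_check.
    """
    cursor = connection.execute(sql)
    header = [column[0] for column in cursor.description]
    # Compare values as strings because CSV carries no type information.
    # Assumption: row order does not matter, so both sides are sorted.
    actual = sorted(tuple(str(value) for value in row) for row in cursor.fetchall())

    reader = csv.reader(io.StringIO(expected_csv))
    expected_header = next(reader)
    expected = sorted(tuple(row) for row in reader)

    return header == expected_header and actual == expected


connection = sqlite3.connect(":memory:")
sql = "SELECT 1 AS id, 'a' AS name UNION ALL SELECT 2, 'b'"
expected_csv = "id,name\n1,a\n2,b\n"
result = check_query(connection, sql, expected_csv)
print(result)
```

In a real project the query would live in a `.sql` file and the expected result in a `.csv` file with the same base name.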

data_check should work with any database supported by [SQLAlchemy](https://docs.sqlalchemy.org/en/13/dialects/). It is currently tested against PostgreSQL, MySQL, SQLite, Oracle and Microsoft SQL Server.

## Quickstart

You need Python 3.6.1 or above to run data_check. The easiest way to install data_check is via [pipx](https://github.com/pipxproject/pipx):

`pipx install data_check`

The data_check Git repository is also a sample data_check project. Clone the repository, switch to the folder and run data_check:

```
git clone git@github.com:andrjas/data_check.git
cd data_check
data_check
```

This will run the tests in the _checks_ folder using the default connection as set in data_check.yml.

See the [documentation](https://andrjas.github.com/data_check) for how to install data_check in different environments with additional database drivers, and for other ways to use data_check.

## Project layout

data_check has a simple layout for projects: a single configuration file and a folder with the test files. You can also organize the test files in subfolders.

    data_check.yml    # The configuration file
    checks/           # Default folder for data tests
        some_test.sql # SQL file with the query to run against the database
        some_test.csv # CSV file with the expected result
        subfolder/    # Tests can be nested in subfolders
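
As a rough illustration of this layout, the sketch below pairs every SQL file in a folder tree with the CSV file of the same name. The helper `find_checks` is hypothetical and only demonstrates the naming convention; how data_check itself discovers tests may differ.

```python
import tempfile
from pathlib import Path


def find_checks(folder):
    """Yield (sql_file, csv_file) pairs for every SQL file with a matching CSV.

    Hypothetical helper for illustration only, not part of data_check.
    """
    for sql_file in sorted(Path(folder).rglob("*.sql")):
        csv_file = sql_file.with_suffix(".csv")
        if csv_file.exists():
            yield sql_file, csv_file


# Build a throwaway project layout to demonstrate the lookup.
with tempfile.TemporaryDirectory() as project:
    checks = Path(project) / "checks"
    (checks / "subfolder").mkdir(parents=True)
    for name in ("some_test", "subfolder/nested_test"):
        (checks / f"{name}.sql").write_text("SELECT 1 AS id")
        (checks / f"{name}.csv").write_text("id\n1\n")
    found = [sql.relative_to(checks).as_posix() for sql, _ in find_checks(checks)]

print(found)
```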

## Documentation

See the [documentation](https://andrjas.github.com/data_check) for how to set up data_check, how to create a new project, and further options.

