Use this component when you want to build a pipeline from your data integration and data preparation tasks, automate that pipeline, and handle input/output data, failures, and so on.
We list some of the workflow management tools (in no particular order) below:
- Luigi is a Python package for building complex pipelines of jobs. It comes with a graphical interface for visualizing the progress of a workflow and its dependencies, managing failures, and more.
- There is, of course, good old GNU Make, which lets you write a makefile that describes how to compute files from other files. Make automatically determines the order in which the files are built, and it works out which files need to be rebuilt based on which source files have been modified. Make can also be used to build and install programs.
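
The two ideas behind Make mentioned above (ordering by dependencies, and rebuilding only what is out of date) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how Make itself is implemented: the helper names `is_stale` and `plan` are hypothetical, staleness is judged purely by file modification times, and `graphlib` requires Python 3.9 or later.

```python
import os
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def is_stale(target, sources):
    """Return True if target is missing or older than any of its sources.

    A missing source file raises FileNotFoundError, loosely like Make
    complaining that it has no rule to make a prerequisite.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def plan(rules):
    """Return the targets that need rebuilding, in dependency order.

    rules maps each target path to the list of files it is computed from
    (hypothetical format, standing in for the rules of a makefile).
    """
    stale = []
    # static_order() yields every file after all of its prerequisites.
    for node in TopologicalSorter(rules).static_order():
        if node not in rules:
            continue  # plain source file: nothing to build
        deps = rules[node]
        # Rebuild if a prerequisite will be rebuilt, or if the target is out of date.
        if any(dep in stale for dep in deps) or is_stale(node, deps):
            stale.append(node)
    return stale
```

For example, with the hypothetical rules `{"clean.csv": ["raw.csv"], "report.txt": ["clean.csv"]}`, modifying `raw.csv` causes `plan` to return both targets, with `clean.csv` before `report.txt`; if nothing has changed since the last build, it returns an empty list.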
*Know of more cool tools to add to the list? Tell us about them.