Data Integration and Data Preparation

  • OPEN-SOURCE COMPONENTS FOR DATA INTEGRATION AND DATA PREPARATION

  • CAN BE COMBINED AND REUSED IN DIFFERENT CONTEXTS AND DIFFERENT WAYS

  • TUTORIALS, DATASETS, AND EXAMPLES

Data Acquisition, Extraction, and Cleaning

Use this component to acquire data from other sources or to extract structured data from text. Most tools in this component also include data cleaning features that, for example, detect or correct inconsistent data.
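
As a rough illustration of what "detecting inconsistent data" can mean, the sketch below flags key values that map to more than one dependent value (for example, one zip code listed with several city spellings). It uses only the Python standard library and is not taken from any tool in this component. The function name `find_inconsistent` and the sample column names are hypothetical.

```python
from collections import defaultdict


def find_inconsistent(rows, key, dependent):
    """Return key values that appear with more than one dependent value.

    Hypothetical helper: values are compared after trimming whitespace and
    lowercasing, so "Madison" and " madison" count as the same value.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[dependent].strip().lower())
    # Keep only keys whose dependent values disagree.
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}


rows = [
    {"zip": "53706", "city": "Madison"},
    {"zip": "53706", "city": " madison"},
    {"zip": "53706", "city": "Madisn"},   # likely typo
    {"zip": "60601", "city": "Chicago"},
]
conflicts = find_inconsistent(rows, key="zip", dependent="city")
```

Here only zip code 53706 is reported, because it appears with two distinct normalized city values. Correcting the value (for instance, by majority vote) would be a separate step.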

Entity Matching

Use this component to determine whether two entities are the same entity or are related in some way.
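
To make the idea concrete, here is a minimal sketch of a similarity-based matcher built on Python's standard `difflib` module. It is an assumption-laden toy, not the method used by any specific tool in this component: the field names, the averaging rule, and the 0.85 threshold are all hypothetical choices.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse runs of whitespace."""
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(rec_a, rec_b, fields=("name", "city"), threshold=0.85):
    """Declare a match when the mean field similarity reaches the threshold.

    The fields and threshold are illustrative assumptions only.
    """
    scores = [similarity(rec_a[f], rec_b[f]) for f in fields]
    return sum(scores) / len(scores) >= threshold


a = {"name": "Dave  Smith", "city": "MADISON"}
b = {"name": "dave smith", "city": "Madison"}
c = {"name": "Alice Jones", "city": "Boston"}

same = is_match(a, b)       # records differ only in case and spacing
different = is_match(a, c)  # unrelated records
```

Real entity matching workflows typically add blocking (to avoid comparing every pair of records) and learned or tuned matchers. This sketch shows only the pairwise comparison step.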

Schema Matching and Mapping

Use this component to match attributes across two schemas, or to generate executable scripts from schema matches that transform data from one format into another.
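
The two tasks can be sketched together as follows: first match attribute names across two schemas by name similarity, then use the resulting mapping to rewrite a record into the target schema. This is a simplified illustration using only the standard library. The helper names `match_attributes` and `apply_mapping`, the name-only matching strategy, and the 0.6 threshold are hypothetical, and real schema matchers often also consider data values and types.

```python
from difflib import SequenceMatcher


def _canon(name: str) -> str:
    """Canonical form of an attribute name: lowercase, no underscores."""
    return name.lower().replace("_", "")


def match_attributes(source, target, threshold=0.6):
    """Map each source attribute to its most similar target attribute.

    Attributes whose best score is below the threshold stay unmatched.
    """
    mapping = {}
    for s in source:
        best, best_score = None, 0.0
        for t in target:
            score = SequenceMatcher(None, _canon(s), _canon(t)).ratio()
            if score > best_score:
                best, best_score = t, score
        if best is not None and best_score >= threshold:
            mapping[s] = best
    return mapping


def apply_mapping(row, mapping):
    """Rename the matched attributes of a record; drop unmatched ones."""
    return {mapping[k]: v for k, v in row.items() if k in mapping}


mapping = match_attributes(
    ["first_name", "zip_code", "notes"], ["FirstName", "ZipCode"]
)
converted = apply_mapping(
    {"first_name": "Ann", "zip_code": "53706", "notes": "x"}, mapping
)
```

In this example `notes` has no sufficiently similar counterpart in the target schema, so it is left out of the mapping and dropped from the converted record.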

Additional Data Preparation Tools

This component contains additional data preparation tools, such as tools for building and automating a workflow of tasks and tools for converting data from one format into another.
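
As one small example of format conversion, the sketch below turns CSV text into JSON using only the Python standard library. It is illustrative and not drawn from any tool listed here. The function name `csv_to_json` is hypothetical, and all values are kept as strings because CSV carries no type information.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


json_text = csv_to_json("id,name\n1,Ann\n2,Bob\n")
records = json.loads(json_text)
```

A workflow tool would typically chain a step like this with acquisition, cleaning, and matching steps rather than run it in isolation.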