Code and Data

This tutorial gives an overview of BigGorilla and, in particular, of some basic concepts in data integration and data preparation.

This is an in-depth tutorial (with code and data about movies) that shows how one typically acquires data, extracts relevant information, profiles and cleans it, and matches and merges datasets.
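The steps named above can be illustrated with a minimal sketch. It is not the tutorial's code: the sample data, the column names, and the helper functions `normalize`, `load`, and `merge` are all hypothetical, and exact matching on a cleaned title stands in for the more careful matching the tutorial covers.

```python
import csv
import io

# Hypothetical sample data standing in for two movie sources.
SOURCE_A = """title,year
The Matrix ,1999
Toy Story,1995
"""
SOURCE_B = """movie_title,rating
the matrix,8.7
Heat,8.3
"""

def normalize(title):
    """Clean a title: collapse whitespace and lowercase it."""
    return " ".join(title.split()).lower()

def load(text, title_column):
    """Acquire rows from CSV text, keyed by the cleaned title."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = normalize(row.pop(title_column))
        rows[key] = row
    return rows

def merge(a, b):
    """Match records on the cleaned title and merge their fields."""
    return {key: {**a[key], **b[key]} for key in a.keys() & b.keys()}

movies_a = load(SOURCE_A, "title")
movies_b = load(SOURCE_B, "movie_title")
merged = merge(movies_a, movies_b)
print(merged)
```

Only the record that appears in both sources survives the merge; titles that differ in more than case or spacing would need a fuzzier matching step.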

This is a simple example (with code and data) that shows how one can convert Wikipedia files from text to JSON format.

Example of Extracting Info from Wikipedia Pages

This example code shows how the titles and first paragraphs of selected Wikipedia articles are extracted and stored in a JSON file.
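As a rough illustration of the idea, and not the example's actual code, the sketch below assumes each article is already available as plain text, with the title on the first line and paragraphs separated by blank lines. The `summarize` helper and the sample article text are hypothetical.

```python
import json

def summarize(article_text):
    """Return the title and first paragraph of one plain-text article."""
    # Blocks are separated by blank lines; the first block is the title.
    blocks = [b.strip() for b in article_text.split("\n\n") if b.strip()]
    title = blocks[0]
    # Join wrapped lines of the first paragraph into a single line.
    first_paragraph = " ".join(blocks[1].split()) if len(blocks) > 1 else ""
    return {"title": title, "first_paragraph": first_paragraph}

# Made-up sample input in the assumed layout.
articles = [
    "Gorilla\n\nGorillas are herbivorous great apes.\n"
    "They live in Africa.\n\nSecond paragraph.",
]

records = [summarize(a) for a in articles]
json_text = json.dumps(records, indent=2)
print(json_text)
```

To store the result in a file instead of a string, `json.dump(records, f)` can be called on a file opened for writing.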

Example of Matching Schemas with FlexMatcher

This example code shows how different schemas can be matched to a mediated schema using BigGorilla’s FlexMatcher package.
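To make the notion of matching to a mediated schema concrete, here is a small sketch. It does not use FlexMatcher or its API; it substitutes a plain name-similarity heuristic from Python's standard library. The mediated schema, the column names, and the helpers `best_match` and `match_schema` are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical mediated schema that every source should map onto.
MEDIATED_SCHEMA = ["movie_title", "movie_year", "movie_rating"]

def best_match(column, targets):
    """Pick the mediated attribute whose name is most similar to `column`."""
    return max(
        targets,
        key=lambda t: SequenceMatcher(None, column.lower(), t).ratio(),
    )

def match_schema(columns, targets=MEDIATED_SCHEMA):
    """Map each source column to its most similar mediated attribute."""
    return {c: best_match(c, targets) for c in columns}

mapping = match_schema(["Title", "Year", "Rating"])
print(mapping)
```

A heuristic based on names alone fails when column names are uninformative, which is one reason to also look at the data in each column.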

Example of Scraping Restaurant Reviews

This example code uses the Scrapy package to scrape reviews from multiple pages of a website.
