Bach

Orchestrate a cluster of preemptible virtual machines on Google Compute Engine.

Prerequisites

Node.js
- Installing Node.js and npm
- Running as a command line tool
Docker
- Setting up Docker to run on a local machine
- Setting up a Docker host on a local subnet
Google Cloud CLI
- Get started with Google Cloud
Slave Docker/VM image

Installation

Install via npm

npm i <tbc> -g

Install from source

Linking allows changes made to the source code to be immediately reflected in the tool.

git clone https://github.com/conorturner/bach.git && \
cd bach && \
npm link

Usage

The bachfile

Applications are defined using a 'bachfile'. It specifies the location of the binary file to be run in the computation and defines the hardware requirements for each slave node.

Map Reduce

This use case supports a basic map phase and collect phase, reading input from any HTTP storage that supports the 'Range' header. Documentation is available here.
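To illustrate why 'Range' header support matters, here is a minimal sketch of how an input file could be divided among workers so that each one reads only its own byte range. The names `splitIntoRanges` and `fetchRange` are hypothetical and are not part of bach. The sketch assumes the total file size is already known (for example from a `Content-Length` response header) and that Node.js 18 or later is available for the global `fetch`.

```javascript
// Hypothetical helper (not bach's API): split `totalBytes` into up to `parts`
// contiguous, non-overlapping byte ranges. HTTP byte ranges are inclusive at
// both ends, so a range of length n starting at s is written "bytes=s-(s+n-1)".
function splitIntoRanges(totalBytes, parts) {
  if (!Number.isInteger(totalBytes) || totalBytes <= 0) {
    throw new RangeError("totalBytes must be a positive integer");
  }
  if (!Number.isInteger(parts) || parts <= 0) {
    throw new RangeError("parts must be a positive integer");
  }

  // Never create more ranges than there are bytes.
  const count = Math.min(parts, totalBytes);
  const base = Math.floor(totalBytes / count);
  const remainder = totalBytes % count;

  const ranges = [];
  let start = 0;
  for (let i = 0; i < count; i++) {
    // The first `remainder` ranges take one extra byte each.
    const length = base + (i < remainder ? 1 : 0);
    const end = start + length - 1;
    ranges.push({ start, end, header: `bytes=${start}-${end}` });
    start = end + 1;
  }
  return ranges;
}

// Hypothetical helper: fetch one range from a server that honours Range.
// Assumes the global fetch available in Node.js 18+.
async function fetchRange(url, range) {
  const res = await fetch(url, { headers: { Range: range.header } });
  if (res.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${res.status}`);
  }
  return Buffer.from(await res.arrayBuffer());
}
```

For a 10-byte file split three ways, `splitIntoRanges(10, 3)` yields the headers `bytes=0-3`, `bytes=4-6` and `bytes=7-9`. A real map phase would also need to handle records that straddle a range boundary, which this sketch leaves out.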

Stream Processing

Documentation is available here.

Interesting Datasets

Good source of datasets: https://registry.opendata.aws/

US IRS filings https://registry.opendata.aws/irs990/ https://s3.amazonaws.com/irs-form-990/index_20xx.json

Massive web crawl database https://registry.opendata.aws/commoncrawl/

NEXRAD weather radar data https://docs.opendata.aws/noaa-nexrad/readme.html Data can be searched by prefix, for example https://noaa-nexrad-level2.s3.amazonaws.com/?prefix=2019/01/19
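As a small sketch of the prefix search, the listing URL for a given day can be built from a date. The function name `nexradListUrl` is hypothetical. The year/month/day prefix layout is taken from the example URL above, and the sketch assumes dates are interpreted in UTC.

```javascript
// Hypothetical helper: build the bucket listing URL for one day of data.
// The "YYYY/MM/DD" prefix layout follows the example URL in this README.
function nexradListUrl(date) {
  const yyyy = String(date.getUTCFullYear());
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0"); // months are 0-based
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `https://noaa-nexrad-level2.s3.amazonaws.com/?prefix=${yyyy}/${mm}/${dd}`;
}

// Example: 19 January 2019 (UTC).
const exampleUrl = nexradListUrl(new Date(Date.UTC(2019, 0, 19)));
console.log(exampleUrl);
```

The bucket responds with an XML listing of the object keys under that prefix. Each key can then be read over HTTP like any other file.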

Database of a subset of worldwide 'events', presumably scraped from the internet. https://www.gdeltproject.org/#intro A smaller 1.1 GB version of the dataset is available at http://data.gdeltproject.org/events/GDELT.MASTERREDUCEDV2.1979-2013.zip

Headers for the 30 GB taxi dataset http://www.debs2015.org/call-grand-challenge.html
