OPS

OPS (OPtimized Shuffle) is a distributed shuffle management system that optimizes the shuffle phase of DAG computing frameworks. The name comes from Ops, a fertility deity and earth goddess of Sabine origin in ancient Roman religion.

System Overview

Architecture

Figure 1: OPS Architecture

Internal View

Figure 2: Internal View Comparison between Legacy Hadoop and Hadoop with OPS

Lifecycle

Figure 3: Lifecycle of OPS

Structure

Figure 4: OPS Structure

ETCD File Structure

ops/
- jobs/
  - job-${jobId}: JobConf
- mapTaskFirstAlloc/
  - mapTaskFirstAlloc-${jobId}: MapTaskAlloc
- mapTaskSecondAlloc/
  - mapTaskSecondAlloc-${jobId}: MapTaskAlloc
- reduceTaskAlloc/
  - reduceTaskAlloc-${jobId}: ReduceTaskAlloc
- shuffle/
  - reduceNum/
    - reduceNum-${nodeIp}-${jobId}-${reduceId}: num
  - mapCompleted/
    - mapCompleted-${nodeIp}-${jobId}-${mapId}: MapReport
  - shuffleCompleted/
    - shuffleCompleted-${dstNodeIp}-${jobId}-${num}-${mapId}: path
  - indexRecords/
    - indexRecords-${nodeIp}-${jobId}-${mapId}: CollectionConf
- tasks/
  - reduceTasks/
    - reduceTask-${nodeIp}-${jobId}-${reduceId}: ReduceConf
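
To make the layout above concrete, here is a minimal sketch of how keys following this scheme could be assembled. The class `OpsKeys` and its method names are hypothetical and are not taken from the OPS source; the sketch also assumes that the levels of the tree are joined with `/` and that the root is the literal `ops`, which is how the listing reads.

```java
// Hypothetical helper that builds etcd key strings following the layout above.
// Class and method names are illustrative only; they are not part of OPS.
final class OpsKeys {
    // Assumption: the tree root is the literal "ops" and levels are joined with "/".
    private static final String ROOT = "ops";

    private OpsKeys() {}

    // ops/jobs/job-${jobId}
    static String job(String jobId) {
        return ROOT + "/jobs/job-" + jobId;
    }

    // ops/shuffle/mapCompleted/mapCompleted-${nodeIp}-${jobId}-${mapId}
    static String mapCompleted(String nodeIp, String jobId, String mapId) {
        return ROOT + "/shuffle/mapCompleted/mapCompleted-"
                + String.join("-", nodeIp, jobId, mapId);
    }

    // ops/shuffle/shuffleCompleted/shuffleCompleted-${dstNodeIp}-${jobId}-${num}-${mapId}
    static String shuffleCompleted(String dstNodeIp, String jobId, String num, String mapId) {
        return ROOT + "/shuffle/shuffleCompleted/shuffleCompleted-"
                + String.join("-", dstNodeIp, jobId, num, mapId);
    }

    // ops/tasks/reduceTasks/reduceTask-${nodeIp}-${jobId}-${reduceId}
    static String reduceTask(String nodeIp, String jobId, String reduceId) {
        return ROOT + "/tasks/reduceTasks/reduceTask-"
                + String.join("-", nodeIp, jobId, reduceId);
    }

    public static void main(String[] args) {
        // Sample identifiers are made up for illustration.
        System.out.println(job("job_0001"));
        System.out.println(mapCompleted("10.0.0.1", "job_0001", "m_000000"));
        System.out.println(shuffleCompleted("10.0.0.2", "job_0001", "3", "m_000000"));
        System.out.println(reduceTask("10.0.0.2", "job_0001", "r_000000"));
    }
}
```

Because each key carries the node IP and job ID in a fixed position, a component could in principle narrow a listing to one node or one job by key prefix; whether OPS actually does so is not stated here.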

Compatibility

Using OPS requires modifying the DAG computing framework. For now, OPS is implemented only on Hadoop MapReduce; our customized Hadoop is available here. We believe the cost of enabling OPS on other DAG computing frameworks is also low.

Build

$ mvn clean install

Run

OPS: Optimized Shuffle

Usage:
  ops.sh [command]

Commands:
  master         OpsMaster
  worker         OpsWorker

Options:
  -h, --help     Show usage

Use ops.sh [command] --help for more information about a command.

License

Apache License 2.0

sjtu-sail/ops-hadoop
