Spark Monitor Fork - A fork of SparkMonitor that works with multiple Spark Sessions

About

SparkMonitor is an extension for Jupyter Notebook that enables the live monitoring of Apache Spark Jobs spawned from a notebook. The extension provides several features to monitor and debug a Spark job from within the notebook interface itself.

Features

Automatically displays a live monitoring tool below cells that run Spark jobs in a Jupyter notebook
A table of jobs and stages with progress bars
A timeline that shows jobs, stages, and tasks
A graph showing the number of active tasks and executor cores over time
A notebook server extension that proxies the Spark UI and displays it in an iframe popup for more details

For a detailed list of features, see the use case notebooks.

How it Works

Build from source

Tool versions: npm 5.6.0, yarn 1.22.4, sbt 1.3.2

cd sparkmonitor/extension
# Build the JavaScript
yarn install # Only needed the first time
yarn run webpack
# Build the SparkListener Scala jar
cd scalalistener/
sbt package

Run Locally

docker build -t sparkmonitor .
docker run -it -p 8888:8888 sparkmonitor

Deploy New Version

cd sparkmonitor/extension
vi VERSION # bump version number
python setup.py sdist
twine upload --repository-url https://upload.pypi.org/legacy/ dist/*

If the twine upload step fails, run rm -rf dist/*, bump the VERSION number again, and rerun the steps above.
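
The manual "bump version number" step can be scripted. The following is only a sketch: it assumes the VERSION file holds a single dotted numeric version such as 0.0.1, which this README does not confirm, and the helper names `bump_patch` and `bump_version_file` are hypothetical, not part of the project.

```python
from pathlib import Path


def bump_patch(version):
    """Return `version` with its last numeric component incremented by one."""
    parts = version.strip().split(".")
    if not parts[-1].isdigit():
        # Refuse to guess how to bump suffixes such as "rc1" or an empty file.
        raise ValueError(f"cannot bump non-numeric component in {version!r}")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)


def bump_version_file(path="VERSION"):
    """Rewrite the VERSION file in place and return the new version string.

    Assumption: the file contains nothing but the version string.
    """
    version_file = Path(path)
    new_version = bump_patch(version_file.read_text())
    version_file.write_text(new_version + "\n")
    return new_version
```

Calling `bump_version_file()` from sparkmonitor/extension would stand in for the `vi VERSION` step before running `python setup.py sdist`, provided the format assumption holds.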

Quick Installation

pip install sparkmonitor-s
jupyter nbextension install sparkmonitor --py --user --symlink
jupyter nbextension enable sparkmonitor --py --user
jupyter serverextension enable --py --user sparkmonitor
ipython profile create && echo "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')" >>  $(ipython profile locate default)/ipython_kernel_config.py

For more detailed instructions click here
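
The last installation command appends the kernel extension line with `>>`, so running it twice leaves two copies in ipython_kernel_config.py. A minimal sketch of an idempotent alternative follows. The helper name `ensure_line` is hypothetical, and the only project-specific value is the configuration line copied from the command above.

```python
from pathlib import Path

# Configuration line copied from the Quick Installation command.
EXTENSION_LINE = "c.InteractiveShellApp.extensions.append('sparkmonitor.kernelextension')"


def ensure_line(config_path, line=EXTENSION_LINE):
    """Append `line` to the file unless an identical line is already there.

    Returns True if the file was modified, False if it was left untouched.
    """
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    if line in (entry.strip() for entry in existing.splitlines()):
        return False
    # Keep the new entry on its own line even if the file lacks a trailing newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + line + "\n")
    return True
```

To use it, pass the path of ipython_kernel_config.py inside the directory printed by `ipython profile locate default`. The profile must already exist, so `ipython profile create` still has to run first.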

Integration with ROOT and SWAN

At CERN, the SparkMonitor extension has two main use cases:

Distributed analysis with ROOT and Apache Spark using the DistROOT module. Here is an example demonstrating this use case.
Integration with SWAN, a service for web-based analysis, via a modified container image for SWAN user sessions.
