Jupyter Notebook is an application for creating and sharing computational documents. JupyterHub is a way of providing notebooks to multiple users. The benefit is that users gain easy interactive access to computational resources without needing to install anything.
GA4GH TES (Task Execution Service) API is a standardized schema and API for describing and executing batch execution tasks on any underlying computational backend. The full TES specification defines its capabilities.
The goal of this issue is to develop, or to lay the foundations for, a GA4GH TES service plugin for JupyterHub that would execute individual cells on a TES instance.
Objective: Build a plugin or extension within JupyterHub that allows seamless access to GA4GH TES, streamlining federated task submission. The plugin will focus on executing a single cell through TES.
Scope: Focus on plugin development, installation instructions, and usage documentation so administrators can easily deploy it across ELIXIR nodes.
This is a larger meta issue that will likely require discussion. Here are some points to help get started:
Considerations:
Core Components
TES Client Library: You'll need a client library in Python (the language most Jupyter notebooks use) to interact with the TES instance. This library will handle:
Constructing TES task requests based on notebook cell content.
Submitting these tasks to the TES server.
Monitoring task execution status.
Retrieving results and outputs.
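As a rough illustration of the client responsibilities listed above, here is a minimal sketch using only the standard library. It assumes the TES server exposes `POST /tasks` and `GET /tasks/{id}` under a base URL that already includes the API prefix. The class name `TesClient`, the helper `http_send`, and the injectable `send` parameter are hypothetical and only there to keep the sketch testable. An existing client such as py-tes would normally replace this.

```python
import json
import urllib.request


def http_send(method, url, body=None):
    """Send one JSON request and return the decoded JSON response."""
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


class TesClient:
    """Hypothetical thin wrapper around the TES task endpoints."""

    def __init__(self, base_url, send=http_send):
        # base_url is assumed to include the API prefix,
        # e.g. "https://tes.example.org/ga4gh/tes/v1" (placeholder host).
        self.base_url = base_url.rstrip("/")
        self.send = send

    def create_task(self, task):
        """Submit a task document and return the server-assigned task id."""
        reply = self.send("POST", f"{self.base_url}/tasks", task)
        return reply["id"]

    def get_task(self, task_id, view="MINIMAL"):
        """Fetch a task; view may be MINIMAL, BASIC, or FULL."""
        return self.send("GET", f"{self.base_url}/tasks/{task_id}?view={view}")
```

Passing a fake `send` callable makes it possible to exercise the client without a running TES server.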
Notebook Integration: Develop a mechanism within the Jupyter notebook environment to:
Identify code cells to be executed on TES. (Perhaps a magic command like %%tes or a dedicated cell tag)
Extract code and dependencies from these cells.
Package them into a format suitable for TES (e.g., Docker image).
Display task status and results within the notebook.
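One possible way to recognise cells marked with `%%tes` is to parse the first line of the cell source. The sketch below is standalone and does not depend on IPython. In a real extension the same logic could sit behind a cell magic registered with IPython's `register_cell_magic`. The option names `--image` and `--cpu` and the default image are assumptions for illustration, not part of any agreed design.

```python
import argparse
import shlex


def parse_tes_cell(cell_source):
    """Return (options, code) if the cell starts with %%tes, else None."""
    lines = cell_source.splitlines()
    head = lines[0].split() if lines else []
    if not head or head[0] != "%%tes":
        return None

    # Hypothetical options; a real plugin would define its own set.
    parser = argparse.ArgumentParser(prog="%%tes", add_help=False)
    parser.add_argument("--image", default="python:3.11-slim")
    parser.add_argument("--cpu", type=int, default=1)

    # shlex handles quoted arguments; drop the leading "%%tes" token.
    # Note: argparse raises SystemExit on unknown or malformed options.
    options = parser.parse_args(shlex.split(lines[0])[1:])
    code = "\n".join(lines[1:])
    return vars(options), code
```

For example, a cell beginning with `%%tes --image python:3.12 --cpu 2` would yield those two options plus the remaining lines as the code to run remotely.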
Cell Identification: Use a magic command (e.g., %%tes) or a cell tag to mark cells for TES execution.
Dependency Management: Automatically detect or allow users to specify required packages for the code in the cell.
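Automatic detection could start from the import statements in the cell. The following sketch walks the syntax tree and lists top-level modules that are not part of the standard library. The function name `detect_imports` is hypothetical. Note that an import name is not always the name of the installable package (for example `sklearn` versus `scikit-learn`), so a user-supplied list would likely still be needed as an override.

```python
import ast
import sys


def detect_imports(code):
    """Return sorted third-party top-level module names imported by code.

    The code must be plain Python: IPython magics or shell escapes
    would need to be stripped before parsing.
    """
    modules = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local code.
            if node.module and node.level == 0:
                modules.add(node.module.split(".")[0])

    # sys.stdlib_module_names exists on Python 3.10+; on older versions
    # this falls back to filtering nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(modules - set(stdlib))
```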
Task Creation: The client library will generate a TES task definition:
Inputs: Code from the cell, required data, and dependency specifications.
Container: A Docker image containing the execution environment (with necessary packages).
Command: The command to execute the code within the container.
Outputs: Specify where to store the results (files, object storage).
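To make the four parts of a task definition concrete, here is a sketch that assembles a task document as a plain dictionary. The field names (`inputs`, `outputs`, `executors`, `image`, `command`, `content`, `path`, `url`, `type`) follow the TES schema. The function name, the default image, and the `/work` paths are assumptions. The sketch passes the cell code inline through the input's `content` field instead of building a custom image, which is a simplification that only suits small cells.

```python
def build_tes_task(code, image="python:3.11-slim", output_url=None,
                   name="notebook-cell"):
    """Build a TES task document that runs one cell's code in a container."""
    script_path = "/work/cell.py"  # assumed location inside the container
    task = {
        "name": name,
        # Inputs: the cell code is shipped inline as a small file.
        "inputs": [{"path": script_path, "content": code, "type": "FILE"}],
        # Container and command: run the script with the image's Python.
        "executors": [{"image": image, "command": ["python", script_path]}],
    }
    if output_url:
        # Outputs: copy everything the cell wrote under /work/outputs
        # to the given storage URL when the task finishes.
        task["outputs"] = [
            {"url": output_url, "path": "/work/outputs", "type": "DIRECTORY"}
        ]
    return task
```

The resulting dictionary can be serialised to JSON and submitted to a TES server as the request body.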
Submission and Monitoring: Submit the task to TES and provide visual feedback in the notebook (e.g., a progress bar, status updates).
Result Retrieval: Fetch outputs from TES and display them in the notebook (e.g., print output, display plots).
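For simple printed output, the text can be read from the task's logs when the task is fetched with the full view. The sketch below assumes the TES log layout, where `logs` holds one entry per attempt and each attempt holds one log per executor. The function name is hypothetical. Servers may truncate these fields, so larger results such as plots or data files would more likely be fetched from the task's declared outputs.

```python
def collect_executor_output(task):
    """Return stdout, stderr and exit code per executor for the last attempt."""
    attempts = task.get("logs") or []
    if not attempts:
        return []  # the task has not run yet, or logs were not requested

    results = []
    # Only the most recent attempt is reported.
    for log in attempts[-1].get("logs") or []:
        results.append({
            "stdout": log.get("stdout", ""),
            "stderr": log.get("stderr", ""),
            "exit_code": log.get("exit_code"),
        })
    return results
```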
Implementation Considerations
Security: Securely handle authentication and authorization to the TES instance.
Scalability: Design for efficient execution of large notebooks with many cells and complex dependencies.
Usability: Provide a user-friendly interface within the notebook for TES interaction.
Flexibility: Support the option to choose from multiple TES instances and allow customization of task parameters.
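The security and flexibility points could be combined in a small registry of named TES instances, each optionally pointing at an environment variable that holds a bearer token. Everything in this sketch is hypothetical: the instance names, URLs, environment variable name, and the choice of bearer tokens. An actual deployment would use whatever authentication its TES servers require.

```python
import os

# Hypothetical registry; in practice this might come from JupyterHub config.
TES_INSTANCES = {
    "local-funnel": {"url": "http://localhost:8000/v1", "token_env": None},
    "example-node": {
        "url": "https://tes.example.org/ga4gh/tes/v1",
        "token_env": "TES_EXAMPLE_TOKEN",
    },
}


def resolve_instance(name, instances=TES_INSTANCES, environ=os.environ):
    """Return (base_url, headers) for a named TES instance."""
    if name not in instances:
        raise ValueError(
            f"Unknown TES instance {name!r}; choose from {sorted(instances)}"
        )
    entry = instances[name]
    headers = {}
    token_env = entry.get("token_env")
    if token_env:
        # The token is read from the environment, never from the notebook.
        token = environ.get(token_env)
        if not token:
            raise RuntimeError(f"Environment variable {token_env} is not set")
        headers["Authorization"] = f"Bearer {token}"
    return entry["url"], headers
```

Keeping tokens in the environment avoids saving credentials inside notebook files that users may share.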
Tools and Technologies
TES Implementations: Funnel, TESK, TES on Azure
Python TES Client: py-tes
Docker: For containerization
Jupyter Extensions: To enhance the notebook interface
Example Workflow for individual cell execution
User adds %%tes to a code cell in their Jupyter notebook.
A "Run on TES" button appears next to the cell.
User clicks the button.
The client library packages the code, data, and dependencies into a Docker image.
A TES task is created and submitted.
The notebook displays the task status (e.g., "Queued," "Running," "Complete").
When the task finishes, the results are fetched from TES and displayed in the notebook.
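The status display in the workflow above could be driven by a polling loop like the following sketch. The state names are taken from the TES specification (`PREEMPTED` was added in version 1.1, so check the version your server implements). The function name, the label mapping, and the injectable `sleep` and `clock` parameters are assumptions made to keep the example testable.

```python
import time

# States after which a task will not change any more.
TERMINAL_STATES = {
    "COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED", "PREEMPTED",
}
# Friendlier labels for the notebook; other states are shown as-is.
LABELS = {
    "QUEUED": "Queued",
    "INITIALIZING": "Initializing",
    "RUNNING": "Running",
    "COMPLETE": "Complete",
}


def wait_for_task(get_state, on_update=print, interval=2.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until a terminal state; report each state change."""
    deadline = clock() + timeout
    last = None
    while True:
        state = get_state()
        if state != last:
            on_update(LABELS.get(state, state))
            last = state
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"Task still in state {state} after {timeout}s")
        sleep(interval)
```

Here `get_state` would be a callable that asks the TES server for the current task state, and `on_update` could update a notebook widget instead of printing.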
If you want to work on this issue:
Assign yourself to the issue (if someone else is already assigned, first ask them whether they would like help, or pick another issue)
Once assigned, move your issue to the "In progress" column on the project board
Start working 🚀
More useful information and link: document online