Add docs framework for how to clone a new DANDI instance #104
base: master
Build out docs for Linc clone
Include more specific deployment token
Trivial env change
@aaronkanzer Why do the instructions say to create an account on PyPI? That should only be done if you're planning to release packages on PyPI, which has nothing to do with interacting with DANDI. |
Hi @jwodder, These instructions are meant for the developers of the data archive and associated tools, especially for developing a new DANDI-like ecosystem which we are doing for the LINC project. Since the DANDI CLI and Python API are a method of interacting with the archive, Aaron added instructions here for releasing the Python package to PyPI. Hope this helps to answer your question. |
Hi @jwodder, we are releasing a clone of the |
Hi @aaronkanzer , should we strive to finalize this PR to some form and merge? That
etc |
Thanks @yarikoptic -- yes, it would be great to get review outside of my own words -- I am currently in the process of some cleanup here; however, review would be helpful. It is still quite a living document as there are more and more things that I think @kabilar and I are slowly abstracting -- let me know if you'd like to Zoom to discuss how we could make the handbook beneficial here for EMBER deployment in the short-term |
Initialize Vendor Accounts
page
@kabilar @yarikoptic @satra @asmacdo @jwodder @waxlamp @jjnesbitt @mvandenburgh Hi all, I'd like to start the review process (and get opinions on what is unclear/missing) for this PR. This PR is a brain-dump of essentially "how to clone DANDI" in its current state. To make the PR more approachable, I tagged specific users at the top of given pages so that no one is required to review the entire PR. If you'd like to view the docs live, I've launched a temporary Netlify site -- https://aquamarine-profiterole-e20e84.netlify.app/59_getting_started_replicating_dandi/ Thanks all in advance
@@ -0,0 +1,6 @@ | |||
The DANDI ecosystem includes a self-hosted Jupyter notebook service. This service is orchestrated on a Kubernetes (k8s) cluster |
@asmacdo would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/65_dandi_hub/
With @kabilar 's suggestions, LGTM
@@ -0,0 +1,83 @@ | |||
# Work In Progress | |||
|
|||
## Setting up your GitHub OAuth Account |
@waxlamp @jjnesbitt @mvandenburgh would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/61_dandi_authentication/
@@ -0,0 +1,448 @@ | |||
# Initialize Vendor Accounts |
@waxlamp @yarikoptic @satra @kabilar would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/60_initialize_vendors/
Needs Dockerhub (or some other container registry)?
@@ -0,0 +1,48 @@ | |||
For data management (predominately `upload`, `download` and `validation` of data |
@jwodder @yarikoptic would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/62_dandi_cli/
Thanks, Aaron. Just noting that given your recent developments to push `lincbrain-cli` changes upstream to the `dandi-cli` in dandi/dandi-cli#1519, we will need to update these instructions.
@@ -0,0 +1,185 @@ | |||
# Work In Progress | |||
|
|||
## Configuring Terraform |
@waxlamp @jjnesbitt @mvandenburgh would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/63_dandi_infrastructure/
@@ -0,0 +1,203 @@ | |||
This step assumes that you have completed all steps in: [Initialize Vendors](../60_initialize_vendors) & [DANDI Infrastructure](../63_dandi_infrastructure) |
@waxlamp @jjnesbitt @mvandenburgh @yarikoptic @satra @kabilar would be great if you could review
For reference to easily read in staging setting: https://aquamarine-profiterole-e20e84.netlify.app/64_dandi_archive/
The DANDI ecosystem includes a self-hosted Jupyter notebook service. This service is orchestrated on a Kubernetes (k8s) cluster | ||
that provides different instance types of users to efficiently interact with data in the DANDI Archive. |
The DANDI ecosystem includes a self-hosted Jupyter notebook service. This service is hosted on AWS and orchestrated with a Kubernetes (k8s) cluster | |
that provides different instance types for users to efficiently interact with data in the DANDI Archive. |
[Proceed to the following README](https://github.com/dandi/dandi-hub/blob/main/README.md#dandihub) to see how you can | ||
set up your own DANDI Hub -- **Note: it is important that your k8s cluster is in the same region | ||
as your data** |
The instructions for configuring and deploying your own JupyterHub instance are available in the [dandi-hub repository](https://github.com/dandi/dandi-hub) (see [README](https://github.com/dandi/dandi-hub/blob/main/README.md#dandihub)). | |
For example configurations that have been previously generated for the DANDI, LINC, and BICAN projects see the [envs directory](https://github.com/dandi/dandi-hub/tree/main/envs). | |
**Note: it is important that your k8s cluster is in the same region as your data.** |
Thank you, Aaron. This is great. I will be reviewing over the next week and slowly adding suggestions. |
|
||
[Proceed to the following README](https://github.com/dandi/dandi-hub/blob/main/README.md#dandihub) to see how you can | ||
set up your own DANDI Hub -- **Note: it is important that your k8s cluster is in the same region | ||
as your data** |
Similar to your Google Doc, perhaps we can add some quick links to each page.
as your data** | |
Resources | |
1. [Source code and instructions](https://github.com/dandi/dandi-hub) | |
1. [DANDI Hub](https://hub.dandiarchive.org/) | |
1. [LINC Hub](https://hub.lincbrain.org/) |
Thanks Kabi -- added.
docs/60_initialize_vendors.md
Outdated
• **Datalad (TBD)** | ||
|
||
• **git-annex (TBD)** |
Those are just tools -- no account needed and overall they just rely on above services (GitHub) and git configuration to run.
But for completeness -- we do need a host (in DANDI's case it is the `drogon` server) with an account under which to run all those additional "cron jobs", hence
In addition, a host (a local server or a cloud instance) is needed to run additional services employing DataLad and git-annex.
Are cron jobs part of the infrastructure? Or should they be considered essential?
We do not have a definition of "infrastructure" that sets the boundary.
If we consider https://github.com/dandisets etc. as part of the infrastructure, then yes.
As for "essential" -- likely not, as long as they are not integrated within the dandiarchive.org web UI.
## datalad (TBD) | ||
|
||
## git-annex (TBD) |
@yarikoptic needs to finish up but likely elsewhere and here just describe setup of the box:
## A host for extra services | |
Some services are not yet integrated within the main infrastructure: | |
- https://github.com/dandi/backups2datalad - to populate/update https://github.com/dandi/dandisets, https://github.com/dandisets, and https://github.com/dandizarrs/ | |
- TODO: heroku logs | |
- TODO: aws s3 access stats dump | |
- TODO: con/tinuous dumps of CI logs | |
- TODO: zarr manifests generation (ATM not on drogon even) | |
- TODO: access stats analysis/plots (yet to be finished/cron deployed) |
Thanks Yarik -- updated
@satra @waxlamp @jwodder @jjnesbitt @mvandenburgh @asmacdo -- just wanted to bump this. Any chance of a quick read-through and some feedback? The outcomes of this handbook will help inform what we automate/abstract into infra-as-code vs. what remains manual, so any feedback is greatly appreciated.
docs/60_initialize_vendors.md
Outdated
• **Heroku** | ||
|
||
• **AWS** | ||
|
||
• **GitHub** | ||
|
||
• **Terraform Cloud** | ||
|
||
• **Netlify** | ||
|
||
• **Sentry** | ||
|
||
• **PyPI** |
I think it would be good to add what each of these services provides to the infrastructure.
Thanks Satra -- included a brief blurb for what each service is responsible for
style="width: 60%; height: auto; display: block; margin-left: auto; margin-right: auto;"/> | ||
<br/><br/> | ||
|
||
Keep this value for further steps. |
It would be valuable to know what sizes of instances we have for DANDI and LINC to guide installations of other instances.
Sure -- I can add, they are defined in the DANDI Infra `api.tf` Girder extension here: https://github.com/dandi/dandi-infrastructure/blob/master/terraform/api.tf#L14-L18
On this note, has any stress-testing ever been done to evaluate if these worker sizes are appropriate or not for DANDI?
style="width: 60%; height: auto; display: block; margin-left: auto; margin-right: auto;"/> | ||
<br/><br/> | ||
|
||
Your frontend should be able to deploy to an auto-generated URL via Netlify now! Steps for domain management and configuration are described further in the [Frontend Deployment](../64_dandi_archive/#frontend-deployment) section of these within the DANDI Archive setup. |
A question came up about how many minutes Netlify needs for a DANDI instance.
When you say `minutes` in this context for Netlify, do you mean "build minutes"? (e.g. how long Netlify runners are required to run to deploy?) Or something else?
Looking for review for now, no need to merge
These documents provide a step-by-step process for anyone who would like to launch their own DANDI-like ecosystem.
Please see here if you'd like to view a live link: https://aquamarine-profiterole-e20e84.netlify.app/
or specifically:
https://lincbrain.github.io/handbook/40_initialization/