Create a "non-data" full release pipeline (ontology metadata curation only) #382
Comments
Current discussion is looking at:
Tagging @pgaudet
Considering doing a full "raw data" release, including Zenodo and a CF endpoint. This may actually be easiest, as it mirrors what we already do: basically the first stage of snapshot, plus the second stage's "publish" step. I'll want to check the size and whether Zenodo can digest it; we want this fully automated and on smooth rails. Maybe skip Zenodo, as we will still have the full release there.
Basing around "raw-data"
…files are now the same (empty) and it seems to be tripping up for some reason; for geneontology/pipeline#382
I'm doing some exploring of a partial run. Looking at what I have, I expect all raw upstreams and first-order products (excluding blazegraph and solr) to run about 10G. This puts us well under typical limits for Zenodo and under our usual publications (which clock in at nearly 50G). If working weekly, this would allow us to use a monthly buffer (and/or an S3 lifecycle) or Zenodo as transport without incurring too much overhead or cost.
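As a rough check on the numbers above, here is a minimal sketch of the buffer arithmetic, plus what an S3 lifecycle expiration rule for it could look like. The 30-day retention, the 7-day interval, and the `raw-data/` prefix are assumptions for illustration, not settled values; the dictionary follows the general shape of an S3 bucket lifecycle configuration and is only built here, not applied to any bucket.

```python
import math

# Approximate figures quoted in the comment above.
WEEKLY_RELEASE_GB = 10      # raw upstreams + first-order products, per run
FULL_RELEASE_GB = 50        # size of a usual full publication

# Assumptions for illustration only.
RETENTION_DAYS = 30         # "monthly buffer"
RELEASE_INTERVAL_DAYS = 7   # "working weekly"


def retained_releases(retention_days: int, interval_days: int) -> int:
    """Upper bound on releases held at once if each expires after retention_days."""
    return math.ceil(retention_days / interval_days)


def lifecycle_rule(prefix: str, days: int) -> dict:
    """Build (but do not apply) a lifecycle configuration expiring objects under prefix."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


max_held = retained_releases(RETENTION_DAYS, RELEASE_INTERVAL_DAYS)
buffer_gb = max_held * WEEKLY_RELEASE_GB
rule = lifecycle_rule("raw-data/", RETENTION_DAYS)  # hypothetical prefix

print(f"at most {max_held} releases held, about {buffer_gb}G "
      f"(one full release is about {FULL_RELEASE_GB}G)")
```

Under these assumptions the buffer tops out at roughly the size of one full release, which is consistent with the "not too much overhead" estimate.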
…iate targets and flush for raw-data.geneontology.org; for #382
From talking to @pgaudet, I think I'll move raw-data.geneontology.org a little closer to where we want it to be by removing "annotations/" and "blazegraph/".
TBD, after talking to Alex: the best way to package and communicate our data for remote processing and re-ingest.
To support the joint pipeline with GOA, we want to create a high-frequency, high-success pipeline that produces all of the data products that GOA needs to complete their parts of the pipeline.
We want to produce:
In such a way as to enable easy pickup and signalling for GOA.
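One possible way to handle "pickup and signalling", sketched here purely as an illustration and not as anything decided in this issue: write a checksum manifest as the last step of a run, so that its presence tells a consumer the release directory is complete. The file name `MANIFEST.json`, its layout, and the function names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "MANIFEST.json"  # hypothetical name, not an agreed convention


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large products need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(release_dir: Path) -> Path:
    """List every product with size and checksum, then publish the manifest last."""
    skip = {MANIFEST_NAME, MANIFEST_NAME + ".tmp"}
    files = sorted(
        p for p in release_dir.rglob("*") if p.is_file() and p.name not in skip
    )
    manifest = {
        "files": [
            {
                "path": p.relative_to(release_dir).as_posix(),
                "bytes": p.stat().st_size,
                "sha256": sha256_of(p),
            }
            for p in files
        ]
    }
    # Write to a temporary name and rename, so a consumer never sees a
    # half-written manifest.
    tmp = release_dir / (MANIFEST_NAME + ".tmp")
    tmp.write_text(json.dumps(manifest, indent=2))
    final = release_dir / MANIFEST_NAME
    tmp.replace(final)
    return final
```

A consumer could then poll for the manifest, verify each checksum after download, and only then start processing.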