Should we use tmpfile names or some other collision protection in upload_staging? #20

Open
phirstgemini opened this issue May 9, 2023 · 0 comments

@phirstgemini
Collaborator

I'm unsure whether this is a good plan. Right now, if the same file gets uploaded twice at the same time, the second upload overwrites the first, simply because both call open(filename, 'wb') on the same filename in upload_staging. When the first upload completes, it adds the file to the ingest queue; that ingest will either fail on a malformed FITS file or keep deferring because the file was recently modified. The second add to the ingest queue will only happen if the first one has already started, so it should all be good in the end, though this does not seem like the most robust or clear way to deal with it.

If we use unique tmpfile names in upload_staging, we're just deferring the collision until the point where we copy the file to dataflow or upload it to S3, which doesn't seem obviously better.
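For reference, here is a minimal sketch of one variant of the tmpfile idea, in which the upload is written under a unique temporary name and then renamed onto the real filename inside the staging directory. The function name `stage_via_tmpfile` and its arguments are hypothetical and not taken from the existing code. The sketch assumes the temporary file and the final file live on the same filesystem, so that `os.replace` is an atomic rename.

```python
import os
import tempfile


def stage_via_tmpfile(staging_dir, filename, chunks):
    """Write an upload under a unique temporary name, then rename it into place.

    Hypothetical helper for illustration; not part of the existing code.
    """
    # mkstemp creates a uniquely named file, so concurrent uploads of the
    # same filename never write into the same file.
    fd, tmp_path = tempfile.mkstemp(
        dir=staging_dir, prefix=filename + ".", suffix=".part"
    )
    try:
        with os.fdopen(fd, "wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
        final_path = os.path.join(staging_dir, filename)
        # Rename onto the real name. With two concurrent uploads, the last
        # rename wins, but the file is always one complete upload.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a partial temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

This avoids two writers interleaving into one file, but it does not decide which upload should win; the collision is still only moved to the rename step, which matches the concern above.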

Maybe we could do some other kind of collision protection, such that the second upload attempt gets rejected until the first one has completed. We could simply check for the existence of filename in upload_staging and reject the attempt if it's already there. However, we'd then need to be very careful to clean up the partial file whenever an upload fails, because a retry could no longer simply overwrite the failed one.
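A minimal sketch of that reject-if-present idea, assuming Python 3.8+ and a plain file in the staging directory. The name `stage_upload` and its arguments are hypothetical. It uses the exclusive-create mode `'xb'`, which makes the existence check and the file creation a single step, and it removes the partial file if the write fails so that a retry is not blocked.

```python
from pathlib import Path


def stage_upload(staging_dir, filename, chunks):
    """Write chunks to staging_dir/filename, refusing if the file exists.

    Hypothetical helper for illustration; not part of the existing code.
    Returns True if the upload was staged, False if it was rejected.
    """
    path = Path(staging_dir) / filename
    try:
        # 'xb' creates the file exclusively and raises FileExistsError
        # if another upload has already created it.
        fh = open(path, "xb")
    except FileExistsError:
        return False
    try:
        with fh:
            for chunk in chunks:
                fh.write(chunk)
    except BaseException:
        # Clean up the partial file so a later retry is not rejected.
        path.unlink(missing_ok=True)
        raise
    return True
```

Note that this only covers failures that raise inside the process. If the process is killed mid-upload, the partial file remains and would still need some separate cleanup, which is the risk described above.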

@phirstgemini phirstgemini added the enhancement New feature or request label May 9, 2023
@phirstgemini phirstgemini self-assigned this May 9, 2023
@phirstgemini phirstgemini added this to the 3.3 release milestone Sep 5, 2024