Downloading ChRIS files from Swift 10x more efficiently by bypassing CUBE #512
jennydaman started this conversation in General
chrs is the best way to download files from a production ChRIS deployment. However, in situations where we can access Swift directly, we can sidestep CUBE to improve download speeds.
On our internal deployment cube-next, downloading 80,000 files (70 GB) is 10x faster using rclone connecting directly to Swift than using chrs, which downloads from CUBE.
Experiment Setup
We want to compare the performance of downloading from CUBE vs. from Swift directly. We're using chrs v0.2.3 as the CUBE client and rclone v1.62.2 as the Swift client. chrs is hard-coded to do 4 concurrent downloads, so the --transfers=4 option is passed to rclone.
First I ran miniChRIS-docker and used chrisomatic to add dbg-bigfiles. Then I configured rclone and chrs for connection.
I created some data in ChRIS by running dbg-bigfiles.
The Benchmark
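A comparison like this can be timed with a small harness that records wall-clock, user, and system time for each client. The following is a minimal Python sketch, not the commands used in this experiment: time_command and Timing are hypothetical names, it relies on the Unix-only resource module, and the demo times a trivial Python process as a stand-in for the real clients.

```python
import resource
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class Timing:
    wall: float    # elapsed real time, in seconds
    user: float    # CPU time spent in user space, in seconds
    system: float  # CPU time spent in the kernel, in seconds


def time_command(argv):
    """Run argv to completion and return its wall, user, and system time."""
    # RUSAGE_CHILDREN accumulates CPU time of child processes that have been waited for.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # blocks until the child exits
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return Timing(
        wall=wall,
        user=after.ru_utime - before.ru_utime,
        system=after.ru_stime - before.ru_stime,
    )


if __name__ == "__main__":
    # Stand-in command; replace with the download command of the client under test.
    print(time_command([sys.executable, "-c", "pass"]))
```

To time the Swift client, one could pass an rclone copy command with --transfers=4 as argv, substituting the remote and destination for your own setup.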
Results
[Timing output for chrs and rclone]
Conclusion
Connecting to Swift directly using rclone is more efficient than downloading files from CUBE using chrs.
This is attributable to the inefficiency of CUBE (here's an experiment assessing the overhead of chrs as a client).
Inefficiency problems are exacerbated in situations where there are numerous small files.
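One way to see why many small files hurt is a toy cost model in which each file pays a fixed per-request overhead on top of the time needed to move its bytes. The sketch below is illustrative only: estimated_seconds is a hypothetical helper, and the overhead and bandwidth values are made-up assumptions, not measurements from this experiment. Only the file count (80,000), total size (70 GB), and concurrency (4) come from the post.

```python
def estimated_seconds(n_files, total_bytes, per_file_overhead_s,
                      bandwidth_bytes_per_s, concurrency=4):
    """Toy model: per-file overhead spread across workers, plus byte transfer time."""
    overhead = n_files * per_file_overhead_s / concurrency
    transfer = total_bytes / bandwidth_bytes_per_s
    return overhead + transfer


if __name__ == "__main__":
    total_bytes = 70 * 10**9  # 70 GB, taking 1 GB as 10^9 bytes (assumption)
    bandwidth = 10**9         # hypothetical 1 GB/s link
    for overhead in (0.05, 0.005):  # hypothetical seconds of overhead per file
        many = estimated_seconds(80_000, total_bytes, overhead, bandwidth)
        few = estimated_seconds(800, total_bytes, overhead, bandwidth)
        print(f"overhead={overhead}s: 80,000 files ~{many:.0f}s, 800 files ~{few:.0f}s")
```

In this model the transfer term is identical for both file counts, so the gap comes entirely from the per-file term. That is why lower per-request overhead matters most when files are small.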
Side Observation: CPU Time
chrs spends more system time than user time, which is different from rclone. This has to do with how chrs is written in async Rust using tokio.
Limitations
In this experiment, client and server were the same computer. In reality, the client and server are usually different computers. Also, results depend a lot on the server's specs and load.