-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add example for Reclient. #286
Add example for Reclient. #286
Conversation
14b957d
to
09510bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @chrisstaite-menlo)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 19 at r1 (raw file):
services: turbo_cache_local_cas: image: allada/turbo-cache:latest
nit: can we use "extends" similar to how it's done here?
https://github.com/TraceMachina/turbo-cache/pull/267/files#diff-419ec2b388f508c75e39af538e03dd9959467d6bea1bf45b8fcda943687d27b3R19
deployment-examples/docker-compose-reclient/docker-compose.yml
line 67 at r1 (raw file):
target: /var/run/docker.sock environment: RUST_LOG: info
fyi: this will certainly slow everything down due to the amount of logs being generated. I would suggest warn
.
It might be worth for me to spend time evaluating what is "info" and what is "debug", right now they are mixed.
deployment-examples/docker-compose-reclient/README.md
line 11 at r1 (raw file):
## Roles There a number of roles required to perform a Reclient set up:
nit: There are
deployment-examples/docker-compose-reclient/README.md
line 14 at r1 (raw file):
- Turbo Cache Scheduler - The Turbo Cache instance that communicates with the Goma proxy and handles scheduling of tasks to workers within the Turbo Cache
Doesn't REclient replace goma?
deployment-examples/docker-compose-reclient/README.md
line 39 at r1 (raw file):
# Required because the default is projects/rbe-chrome-untrusted/instances/default_instance # and we do not support instance names with / in currently. export RBE_instance=main
nit: I'd remove this as it's about to be fixed.
deployment-examples/docker-compose-reclient/README.md
line 52 at r1 (raw file):
Now that the environment is set up you simply need to enable it for your Chromium build by modifying your `args.gn` file to contain:
note: You can also use:
gn gen --args="use_goma=true rbe_cfg_dir=\"../../buildtools/reclient_cfgs/linux\"" out/Default
deployment-examples/docker-compose-reclient/run_in_container.sh
line 13 at r1 (raw file):
WORK_DIRECTORY=$(pwd | cut -c25-94) WORKING_DIRECTORY=$(pwd | cut -c95-) exec docker run --rm -w /work${WORKING_DIRECTORY} -v ${HOST_ROOT}${WORK_DIRECTORY}:/work $(echo "$CONTAINER" | cut -c10-) /bin/env "$@"
oh the memories.
It turns out this has a high chance of causing spurious errors (from experience). We got around these random docker failures by running a script in docker that would collect the exit code and write it to a file then outside docker read the file. If the file was not written too, it'd kill/restart the worker and inform the scheduler to disconnect all future work (because it means docker, kernel or other system failure if no exit code).
no action, but FYI.
deployment-examples/docker-compose-reclient/run_in_container.sh
line 13 at r1 (raw file):
WORK_DIRECTORY=$(pwd | cut -c25-94) WORKING_DIRECTORY=$(pwd | cut -c95-) exec docker run --rm -w /work${WORKING_DIRECTORY} -v ${HOST_ROOT}${WORK_DIRECTORY}:/work $(echo "$CONTAINER" | cut -c10-) /bin/env "$@"
nit: Shouldn't it be cut -c9-
?
deployment-examples/docker-compose-reclient/worker.json
line 55 at r1 (raw file):
}, // Need to specify a placeholder here otherwise Priority scheduling does // not work.
hmmm, is this a bug? Shouldn't we just be able to pass an empty array?
If not can we file a ticket and reference it?
deployment-examples/docker-compose-reclient/worker_precondition_script.sh
line 3 at r1 (raw file):
#!/bin/sh set -eu
nit: Can we add a comment here stating when and how this file runs?
deployment-examples/docker-compose-reclient/worker_precondition_script.sh
line 5 at r1 (raw file):
AVAILABLE_MEMORY_KB=$(awk '/MemAvailable/ { printf "%d \n", $2 }' /proc/meminfo); # At least 2Gb of RAM currently available
nit: period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @allada)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 67 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
fyi: this will certainly slow everything down due to the amount of logs being generated. I would suggest
warn
.It might be worth for me to spend time evaluating what is "info" and what is "debug", right now they are mixed.
Yeah, this is a lay-over from debugging that I forgot to remove.
deployment-examples/docker-compose-reclient/README.md
line 14 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
Doesn't REclient replace goma?
Can you tell this was copy-amended?
deployment-examples/docker-compose-reclient/README.md
line 39 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: I'd remove this as it's about to be fixed.
It was really just a reminder to me to remove it once it is fixed.
deployment-examples/docker-compose-reclient/run_in_container.sh
line 13 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
oh the memories.
It turns out this has a high chance of causing spurious errors (from experience). We got around these random docker failures by running a script in docker that would collect the exit code and write it to a file then outside docker read the file. If the file was not written too, it'd kill/restart the worker and inform the scheduler to disconnect all future work (because it means docker, kernel or other system failure if no exit code).
no action, but FYI.
Yeah, I've been there too... I hope this doesn't come back to bite me
deployment-examples/docker-compose-reclient/run_in_container.sh
line 13 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Shouldn't it be
cut -c9-
?
I don't think so, it appears to be working fine for me.
deployment-examples/docker-compose-reclient/worker.json
line 55 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
hmmm, is this a bug? Shouldn't we just be able to pass an empty array?
If not can we file a ticket and reference it?
Yeah, I wasn't sure to be honest how it was supposed to work.
Did you forget to push the changes? It looks nearly ready to land. |
Mostly I've been busy sorting other stuff to get this over the line. But also, there's a few glitches with set-up that I wanted to iron out so the README just worked out of the box that I haven't had time to finalise yet. |
47b96f3
to
980fd05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the README such that it actually works now.
Reviewable status: 5 of 7 files reviewed, 4 unresolved discussions (waiting on @allada)
deployment-examples/docker-compose-reclient/README.md
line 52 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
note: You can also use:
gn gen --args="use_goma=true rbe_cfg_dir=\"../../buildtools/reclient_cfgs/linux\"" out/Default
Done.
980fd05
to
fa6bfc0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ready now, but I've not tested it in-place. I'd like someone separately to run through the steps in the README and validate it actually works. Could you do that @allada ?
Reviewable status: 3 of 7 files reviewed, 3 unresolved discussions (waiting on @allada)
Is it ok to defer this and have @blakehatch or @aaronmondal do it? They are going to be ramping up over the next week and this would even be better to have completely fresh eyes on it. |
fa6bfc0
to
8d31bad
Compare
Apparently |
90dc9c1
to
3252364
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r3, 1 of 2 files at r4.
Reviewable status: 5 of 7 files reviewed, all discussions resolved (waiting on @chrisstaite-menlo)
67f3869
to
790d785
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 7 files at r1, 2 of 4 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @chrisstaite-menlo)
Little bump on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about +@adam-singer why don't you take a look at this first, then I'll do one final review. Hopefully this will help you get some knowledge on deployment examples?
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @adam-singer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @chrisstaite-menlo)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 1 at r4 (raw file):
# Copyright 2023 The Turbo Cache Authors. All rights reserved.
Mentioned a few times below, the project name has recently changed, could we apply those changes to this patch "Turbo Cache" -> "Native Link"
deployment-examples/docker-compose-reclient/docker-compose.yml
line 19 at r4 (raw file):
services: turbo_cache_local_cas: image: allada/turbo-cache:latest
@allada do we have a different/canonical location for this image?
deployment-examples/docker-compose-reclient/README.md
line 13 at r4 (raw file):
There are a number of roles required to perform a Reclient set up: - Turbo Cache Scheduler - The Turbo Cache instance that handles all of the
The project has been renamed to "Native Link", lets word towards that
deployment-examples/docker-compose-reclient/README.md
line 49 at r4 (raw file):
fail_early_min_fallback_ratio=0.5 deps_cache_max_mb=256 # TODO(b/276727504) Re-enable once noop build shutdown time bug is fixed
This looks like an internal bug link, can we remove or specific a bit more what the bug is related to?
deployment-examples/docker-compose-reclient/run_in_container.sh
line 15 at r4 (raw file):
# on the host and we need to account for that. This strips off the # /root/.cache/turbo-cache and that is populated with $HOST_ROOT from the # worker.json which is then populated by ${TURBO_CACHE_DIR} from the
TURBO_CACHE_DIR
-> NATIVE_LINK_DIR
deployment-examples/docker-compose-reclient/run_in_container.sh
line 17 at r4 (raw file):
# worker.json which is then populated by ${TURBO_CACHE_DIR} from the # docker-compose.yml. WORK_DIRECTORY=$(pwd | cut -c25-94)
for the pipes to cut
what kind of assumptions are those character ranges making? It is unclear what they might be
deployment-examples/docker-compose-reclient/worker.json
line 13 at r4 (raw file):
"grpc": { "instance_name": "main", "endpoints": ["grpc://${CAS_ENDPOINT:-127.0.0.1}:50051"],
Both environment variables are CAS_ENDPOINT
, is there a situation where we would have a difference between these endpoints and their logical hosts?
deployment-examples/docker-compose-reclient/worker.json
line 21 at r4 (raw file):
"fast": { "filesystem": { "content_path": "/root/.cache/turbo-cache/data-worker-test/content_path-cas",
turbo-cache
-> native-link
, same below
deployment-examples/docker-compose-reclient/worker.json
line 45 at r4 (raw file):
"additional_environment": { "CONTAINER": {"Property": "container-image"}, "HOST_ROOT": {"Value": "${TURBO_CACHE_DIR}"},
TURBO_CACHE_DIR
-> NATIVE_LINK_DIR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @adam-singer and @chrisstaite-menlo)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 19 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
@allada do we have a different/canonical location for this image?
Yeah this is now TraceMachina/native-link
Previously, allada (Nathan (Blaise) Bruer) wrote…
FYI: This isn't updated in docker-compose/docker-compose.yml. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 1 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
Mentioned a few times below, the project name has recently changed, could we apply those changes to this patch "Turbo Cache" -> "Native Link"
Done.
deployment-examples/docker-compose-reclient/README.md
line 13 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
The project has been renamed to "Native Link", lets word towards that
Done.
deployment-examples/docker-compose-reclient/README.md
line 49 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
This looks like an internal bug link, can we remove or specific a bit more what the bug is related to?
This is a Chromium bug link, those that work on Chromium will understand and there's no reason to be using this if you don't work on Chromium.
deployment-examples/docker-compose-reclient/run_in_container.sh
line 15 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
TURBO_CACHE_DIR
->NATIVE_LINK_DIR
Done.
deployment-examples/docker-compose-reclient/run_in_container.sh
line 17 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
for the pipes to
cut
what kind of assumptions are those character ranges making? It is unclear what they might be
There's a whole comment explaining that above.
deployment-examples/docker-compose-reclient/worker.json
line 13 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
Both environment variables are
CAS_ENDPOINT
, is there a situation where we would have a difference between these endpoints and their logical hosts?
Yes, but this is simply an example deployment, one wouldn't usually run the worker on the same node as the scheduler, build and CAS.
deployment-examples/docker-compose-reclient/worker.json
line 21 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
turbo-cache
->native-link
, same below
Done.
deployment-examples/docker-compose-reclient/worker.json
line 45 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
TURBO_CACHE_DIR
->NATIVE_LINK_DIR
Done.
6754a39
to
c1edb16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @allada and @chrisstaite-menlo)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 19 at r4 (raw file):
Previously, chrisstaite-menlo (Chris Staite) wrote…
FYI: This isn't updated in docker-compose/docker-compose.yml.
@chrisstaite-menlo looks like we will have a few more days of breaking changes, do you mind if we leave this open and revisit it next week for landing. At that point we should have stabilized image and binary command name. Thank you for this and patients around breaking changes
@adam-singer and @allada I think we can look back into getting this working again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: publish-image, windows-2022 / beta (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/docker-compose.yml
line 19 at r4 (raw file):
Previously, adam-singer (Adam Singer) wrote…
@chrisstaite-menlo looks like we will have a few more days of breaking changes, do you mind if we leave this open and revisit it next week for landing. At that point we should have stabilized image and binary command name. Thank you for this and patients around breaking changes
This image is now now be ghcr.io/tracemachina/nativelink:wm6l3qi72hv9phn44853ir1dxs3q7k65
instead of tracemachina/nativelink:latest
. I think it's a drop-in replacement, but I'm not 100% sure. Might need some adjustments to mount paths. The new image already has the cas
executable set as entrypoint, so the command to run should just be the path to the config file, i.e.
command: /root/local-storage-cas.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcusSorealheis agree.
@chrisstaite-menlo we are ready to review this, pointed out some breaking changes since 0.1.0
release, also will likely need to revert Cargo.lock file.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: publish-image, windows-2022 / beta
deployment-examples/docker-compose-reclient/scheduler.json
line 7 at r6 (raw file):
"instance_name": "main", "endpoints": ["grpc://${CAS_ENDPOINT:-127.0.0.1}:50051"], "store_type": "CAS"
CAS
-> cas
deployment-examples/docker-compose-reclient/scheduler.json
line 14 at r6 (raw file):
"instance_name": "main", "endpoints": ["grpc://${CAS_ENDPOINT:-127.0.0.1}:50051"], "store_type": "AC"
AC
-> ac
deployment-examples/docker-compose-reclient/scheduler.json
line 28 at r6 (raw file):
"simple": { "supported_platform_properties": { "cpu_count": "Minimum",
Minimum
-> minimum
deployment-examples/docker-compose-reclient/scheduler.json
line 29 at r6 (raw file):
"supported_platform_properties": { "cpu_count": "Minimum", "container-image": "Priority"
Priority
-> priority
deployment-examples/docker-compose-reclient/scheduler.json
line 72 at r6 (raw file):
} }, { "listen_address": "0.0.0.0:50061",
This has changed to the following format https://github.com/adam-singer/native-link/blob/d8239640a9fa26c932a4c234ee2d263837159388/nativelink-config/examples/basic_cas.json#L131
"listener": {
"http": {
"socket_address": "0.0.0.0:50061"
}
},
deployment-examples/docker-compose-reclient/scheduler.json
line 81 at r6 (raw file):
"scheduler": "MAIN_SCHEDULER", }, "prometheus": {
prometheus
-> experimental_prometheus
92f025a
to
583effb
Compare
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
583effb
to
d79d2ba
Compare
d79d2ba
to
082a1ed
Compare
082a1ed
to
fd1072d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, publish-image, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04 (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/scheduler.json
line 7 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
CAS
->cas
Done.
deployment-examples/docker-compose-reclient/scheduler.json
line 28 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
Minimum
->minimum
Done.
deployment-examples/docker-compose-reclient/scheduler.json
line 29 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
Priority
->priority
Done.
deployment-examples/docker-compose-reclient/scheduler.json
line 72 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
This has changed to the following format https://github.com/adam-singer/native-link/blob/d8239640a9fa26c932a4c234ee2d263837159388/nativelink-config/examples/basic_cas.json#L131
"listener": { "http": { "socket_address": "0.0.0.0:50061" } },
Done.
deployment-examples/docker-compose-reclient/scheduler.json
line 81 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
prometheus
->experimental_prometheus
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, publish-image, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04 (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/scheduler.json
line 14 at r6 (raw file):
Previously, adam-singer (Adam Singer) wrote…
AC
->ac
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r7, 7 of 8 files at r8, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chrisstaite-menlo was able to test this patch against latest main with the following adjustments to the worker.json
https://gist.github.com/adam-singer/a7aa64abf6445eb5752f418a457ca25a#file-gistfile0-txt-L28 (breaking changes from previous release)
Followed instructions in the read me, but instead of creating / updating the reproxy.cfg
used the following cli to forward all reclient parameters required
RBE_reclient_timeout=60m RBE_exec_timeout=4m RBE_instance=main RBE_service=127.0.0.1:50052 RBE_alsologtostderr=true RBE_service_no_security=true RBE_service_no_auth=true RBE_local_resource_fraction=0.00001 RBE_automatic_auth=false RBE_gcert_refresh_timeout=20 RBE_compression_threshold=-1 RBE_metrics_namespace=main RBE_platform= RBE_experimental_credentials_helper= RBE_experimental_credentials_helper_args= autoninja -v -j 1 -C out/Default unit_tests
Checked out the chromium codebase from a client machine using the following instructions https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md
Can you apply the changes and land this?
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks @chrisstaite-menlo, I am going to try this in a day or so and double check everything works.
We are trying to be a lot more diligent on documentation (due to feedback from users). This quarter we are planning on making compiling chrome our base case, meaning we are going to build a bit of CI and tests and such around this PR that we are going to use as a large integration & performance test.
Thanks again 😄
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale (waiting on @adam-singer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale (waiting on @adam-singer)
a discussion (no related file):
Note: I suggest reading all the comments before any actions are done.
Second: Could you rebase, there's some breaking changes.
Last: I did test it and although it was not easy to setup, I did get it to work. We are going to be doing a lot of work here over the next couple months so assume lots of changes (hopefully to make your life better too).
Thanks again, awesome as usual 😄
deployment-examples/docker-compose-reclient/docker-compose.yml
line 1 at r8 (raw file):
# Copyright 2023 The Native Link Authors. All rights reserved.
nit: NativeLink
deployment-examples/docker-compose-reclient/README.md
line 13 at r8 (raw file):
There are a number of roles required to perform a Reclient set up: - Native Link Scheduler - The Native Link instance that handles all of the
nit: NativeLink
(and others in this file)
deployment-examples/docker-compose-reclient/README.md
line 28 at r8 (raw file):
## Running To start Native Link simply execute `docker-compose up` from this directory.
Lets add a section telling users to follow the instructions here on how to install and setup chrome:
https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md
Lets also state that we will assume the user's checked out chrome to ~/chromium/src
This way when we talk about things that need to be in ~/chromium
so we can explicitly mark it as well as assume we are building on linux.
deployment-examples/docker-compose-reclient/README.md
line 32 at r8 (raw file):
port 50052. Create the file buildtools/reclient_cfgs/reproxy.cfg with the following:
nit: Please mention this change needs to happen in chrome's checkout.
deployment-examples/docker-compose-reclient/README.md
line 32 at r8 (raw file):
port 50052. Create the file buildtools/reclient_cfgs/reproxy.cfg with the following:
It think we should have users modify reproxy.cfg.template
instead? This way gclient sync
or gclient runhooks
does the work for us?
deployment-examples/docker-compose-reclient/README.md
line 34 at r8 (raw file):
Create the file buildtools/reclient_cfgs/reproxy.cfg with the following:instance=main
I suggest modifying template (as stated above), then:
instance=$rbe_instance
service=127.0.0.1:50052
log_format=reducedtext
$auth_flags
fail_early_min_action_count=4000
fail_early_min_fallback_ratio=0.5
deps_cache_max_mb=256
# TODO(b/276727504) Re-enable once noop build shutdown time bug is fixed
# enable_deps_cache=true
async_reproxy_termination=true
use_unified_uploads=true
fast_log_collection=true
depsscanner_address=$depsscanner_address
# Improve upload/download concurrency
max_concurrent_streams_per_conn=50
max_concurrent_requests_per_conn=50
min_grpc_connections=50
cas_concurrency=1000
# TODO(allada) Native Link doesn't yet support compressed blob upload
compression_threshold=-1
use_batches=false
# Metric metadata
metrics_namespace=$rbe_project
# Disables both TLS and authentication
service_no_security=true
# Required to stop autoninja from complaining about authentication despite being
# implied by service_no_security
service_no_auth=true
Then in the readme ask people to modify ~/chromium/.gclient
with something like:
solutions = [
{
"name": "src",
"url": "https://chromium.googlesource.com/chromium/src.git",
"managed": False,
"custom_deps": {},
"custom_vars": {
"rbe_instance": "main",
"rbe_project": "main",
"depsscanner_address": "exec:///home/allada/chromium/src/buildtools/reclient/scandeps_server",
},
},
]
I'm going to see about pushing an upstream chrome change to make remotebuildexecution.googleapis.com:443
a variable too. This would eventually streamline everything once we enable a few more features.
deployment-examples/docker-compose-reclient/README.md
line 65 at r8 (raw file):
Now that the environment is set up you simply need to enable it for your Chromium build by modifying your `args.gn` file to contain:
Maybe suggest calling gn args out/Default
and pasting it in there instead?
deployment-examples/docker-compose-reclient/worker.json
line 41 at r8 (raw file):
} }, "workers": [{
nit corrected for latest main changes:
"workers": [{
"local": {
"worker_api_endpoint": {
"uri": "grpc://${SCHEDULER_ENDPOINT:-127.0.0.1}:50061",
},
"entrypoint": "/root/run_in_container.sh",
"additional_environment": {
"CONTAINER": {
"property": "container-image"
},
"HOST_ROOT": {
"value": "${NATIVE_LINK_DIR}"
}
},
"cas_fast_slow_store": "WORKER_FAST_SLOW_STORE",
"upload_action_result": {
"ac_store": "GRPC_LOCAL_AC_STORE"
},
"work_directory": "/root/.cache/nativelink/work",
"platform_properties": {
"OSFamily": {
"values": ["Linux"]
},
"cpu_count": {
"query_cmd": "nproc"
},
// Need to specify a placeholder here otherwise Priority scheduling does
// not work.
"container-image": {
"values": ["placeholder"]
}
},
"experimental_precondition_script": "/root/worker_precondition_script.sh"
}
}],
deployment-examples/docker-compose-reclient/worker.json
line 48 at r8 (raw file):
"entrypoint_cmd": "/root/run_in_container.sh", "additional_environment": { "CONTAINER": {"Property": "container-image"},
nit: Lets name this NL_CONTAINER_IMAGE
for consistency.
deployment-examples/docker-compose-reclient/worker.json
line 49 at r8 (raw file):
"additional_environment": { "CONTAINER": {"Property": "container-image"}, "HOST_ROOT": {"Value": "${NATIVE_LINK_DIR}"},
fyi: I do have a local patch to make this normalized, and will have a proper way to get the worker's root directory.
deployment-examples/docker-compose-reclient/worker.json
line 49 at r8 (raw file):
"additional_environment": { "CONTAINER": {"Property": "container-image"}, "HOST_ROOT": {"Value": "${NATIVE_LINK_DIR}"},
nit: Lets name this NL_HOST_ROOT
for consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/README.md
line 34 at r8 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
I suggest modifying template (as stated above), then:
instance=$rbe_instance service=127.0.0.1:50052 log_format=reducedtext $auth_flags fail_early_min_action_count=4000 fail_early_min_fallback_ratio=0.5 deps_cache_max_mb=256 # TODO(b/276727504) Re-enable once noop build shutdown time bug is fixed # enable_deps_cache=true async_reproxy_termination=true use_unified_uploads=true fast_log_collection=true depsscanner_address=$depsscanner_address # Improve upload/download concurrency max_concurrent_streams_per_conn=50 max_concurrent_requests_per_conn=50 min_grpc_connections=50 cas_concurrency=1000 # TODO(allada) Native Link doesn't yet support compressed blob upload compression_threshold=-1 use_batches=false # Metric metadata metrics_namespace=$rbe_project # Disables both TLS and authentication service_no_security=true # Required to stop autoninja from complaining about authentication despite being # implied by service_no_security service_no_auth=true
Then in the readme ask people to modify
~/chromium/.gclient
with something like:solutions = [ { "name": "src", "url": "https://chromium.googlesource.com/chromium/src.git", "managed": False, "custom_deps": {}, "custom_vars": { "rbe_instance": "main", "rbe_project": "main", "depsscanner_address": "exec:///home/allada/chromium/src/buildtools/reclient/scandeps_server", }, }, ]
I'm going to see about pushing an upstream chrome change to make
remotebuildexecution.googleapis.com:443
a variable too. This would eventually streamline everything once we enable a few more features.
corrections:
instance=$rbe_instance
service=127.0.0.1:50051
log_format=reducedtext
# auth_flags # NativeLink chrome demo does not use credentails.
fail_early_min_action_count=4000
fail_early_min_fallback_ratio=0.5
deps_cache_max_mb=256
# TODO(b/276727504) Re-enable once noop build shutdown time bug is fixed
# enable_deps_cache=true
async_reproxy_termination=true
use_unified_uploads=true
fast_log_collection=true
depsscanner_address=$depsscanner_address
# Improve upload/download concurrency
max_concurrent_streams_per_conn=50
max_concurrent_requests_per_conn=50
min_grpc_connections=50
cas_concurrency=1000
# TODO(allada) Native Link doesn't yet support compressed blob upload
compression_threshold=-1
use_batches=false
# Metric metadata
metrics_namespace=$rbe_project
# Disables both TLS and authentication
service_no_security=true
# Required to stop autoninja from complaining about authentication despite being
# implied by service_no_security
service_no_auth=true
And
solutions = [
{
"name": "src",
"url": "https://chromium.googlesource.com/chromium/src.git",
"managed": False,
"custom_deps": {},
"custom_vars": {
"rbe_instance": "main",
"rbe_project": "main",
"depsscanner_address": "exec://%s/buildtools/reclient/scandeps_server",
},
},
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale (waiting on @adam-singer)
deployment-examples/docker-compose-reclient/scheduler.json
line 23 at r8 (raw file):
}, "schedulers": { "MAIN_SCHEDULER": {
suggest:
"MAIN_SCHEDULER": {
"property_modifier": {
"modifications": [
{"add": {"name": "cpu_count", "value": "1"}}
],
"scheduler": {
"simple": {
"supported_platform_properties": {
"cpu_count": "minimum",
"OSFamily": "exact",
"InputRootAbsolutePath": "priority",
"container-image": "priority",
"label:action_default": "priority",
"label:action_large": "priority"
}
}
}
}
}
deployment-examples/docker-compose-reclient/worker.json
line 41 at r8 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit corrected for latest main changes:
"workers": [{ "local": { "worker_api_endpoint": { "uri": "grpc://${SCHEDULER_ENDPOINT:-127.0.0.1}:50061", }, "entrypoint": "/root/run_in_container.sh", "additional_environment": { "CONTAINER": { "property": "container-image" }, "HOST_ROOT": { "value": "${NATIVE_LINK_DIR}" } }, "cas_fast_slow_store": "WORKER_FAST_SLOW_STORE", "upload_action_result": { "ac_store": "GRPC_LOCAL_AC_STORE" }, "work_directory": "/root/.cache/nativelink/work", "platform_properties": { "OSFamily": { "values": ["Linux"] }, "cpu_count": { "query_cmd": "nproc" }, // Need to specify a placeholder here otherwise Priority scheduling does // not work. "container-image": { "values": ["placeholder"] } }, "experimental_precondition_script": "/root/worker_precondition_script.sh" } }],
Corrected again for what I think is better:
"workers": [{
"local": {
"worker_api_endpoint": {
"uri": "grpc://${SCHEDULER_ENDPOINT:-127.0.0.1}:50061",
},
"entrypoint": "/root/run_in_container.sh",
"additional_environment": {
"CONTAINER": {
"property": "container-image"
},
"HOST_ROOT": {
"value": "${NATIVE_LINK_DIR}"
}
},
"cas_fast_slow_store": "WORKER_FAST_SLOW_STORE",
"upload_action_result": {
"ac_store": "GRPC_LOCAL_AC_STORE"
},
"work_directory": "/root/.cache/nativelink/work",
"platform_properties": {
"OSFamily": {
"values": ["Linux"]
},
"cpu_count": {
"query_cmd": "nproc"
},
// Need to specify a placeholder here otherwise Priority scheduling does
// not work.
"container-image": {
"values": ["placeholder"]
},
"InputRootAbsolutePath": {
"values": ["placeholder"]
},
"label:action_default": {
"values": ["placeholder"]
},
"label:action_large": {
"values": ["placeholder"]
}
},
"experimental_precondition_script": "/root/worker_precondition_script.sh"
}
}],
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r8, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: pre-commit-checks, vale (waiting on @adam-singer)
fd1072d
to
07b8adc
Compare
07b8adc
to
5597cfc
Compare
|
This information is now in the chromium example https://github.com/TraceMachina/nativelink/tree/main/deployment-examples/chromium |
This change is