-
Notifications
You must be signed in to change notification settings - Fork 922
[Core][Serve] Add gRPC code to SkyServe #6702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/quicktest-core |
|
smoketest, and smoketest serve pass @kevinmingtarja I think I'll have to upgrade the api server version (or maybe skylet version?). I am getting backwards compatibility issues because skylet doesn't have the Serve implementation for grpc (nevertheless, |
Yeah good catch, I'm also facing that problem in my PR, and I think upgrading the skylet version is not enough, because we don't auto-upgrade skylet/clusters that are launched by an older API server (we have to re-run
I was planning on just doing (1) for now in #6647, and leave (2) for a follow up PR. |
Got it. Also, I feel it will be helpful for us to have some kind of standards on catching errors that arise with grpc.
The existing codepath exposes CommandError and FetchClusterInfoError, but handling them is extremely inconsistent (some errors are wrapped as RuntimeExceptions while others are just exposed), which makes it really confusing to develop. |
|
/smoke-test --serve |
|
/smoke-test --kubernetes |
I saw that the |
|
/smoke-test -k test_launch_fast_with_autostop --aws --jobs-consolidation |
|
/smoke-test --serve |
|
/quicktest-core |
|
/smoke-test --serve --grpc |
|
/smoke-test --serve --kubernetes --grpc |
kevinmingtarja
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @kyuds ! Looks solid overall, just a few nits and discussion points.
kevinmingtarja
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Should we run quicktest once more as a sanity check?
|
/quicktest-core |
|
rerunning failed smoke tests (flaky, due to provisioning errors which are unrelated) |
|
smoke-tests passing (current failures due to provisioning errors/manual cancel) linking buildkite results below (commit to pre-review commit, but review only was concerned about renaming, etc): |
Replaces
ServeCodeGenwith a gRPC service in skylet.TODO:
Notes:
RpcRunnerto have more fine-grained control over monkey-patching. Otherwise, a lot of tests were working due to "side-effects" of monkey-patches.serve_rpc_utilsdue to circular imports (backend_utils)Tested (run the relevant ones):
bash format.sh/smoke-test(CI) orpytest tests/test_smoke.py(local)/smoke-test -k test_name(CI) orpytest tests/test_smoke.py::test_name(local)/quicktest-core(CI) orpytest tests/smoke_tests/test_backward_compat.py(local)