[Perf] Remote API server checking network connection times out slowly #6263
If we just need one of these to be live, could we also test it such that
instead of
? |
Will switch to that!
| # `sky serve up`. If we have controller's head_ip available and it is ssh-reachable, | ||
| # `sky serve up`. If we have controller's head_ip available and it is ssh-reachable, |
I am so confused on what is happening here. Not blocking approval because it isn't important by any means - I'm just confused.
I have no idea what happened here
If 8.8.8.8 is generally more reliable than 1.1.1.1, we can still keep it as the first tried entry.
Sometimes, when `sky status` runs against a remote API server, it checks the network connection by pinging endpoints. It tests two endpoints with 3 retries each and a three-second timeout per attempt. This means that `https://1.1.1.1` is tested for 9 seconds before Google's endpoint, `https://8.8.8.8`, is tried. This PR swaps the order (Google's endpoint first) and reduces the timeout from 3s to 1s. This was tested without any jobs running.

Tested (run the relevant ones):
- `bash format.sh`
- `/smoke-test` (CI) or `pytest tests/test_smoke.py` (local)
- `/smoke-test -k test_name` (CI) or `pytest tests/test_smoke.py::test_name` (local)
- `/quicktest-core` (CI) or `pytest tests/smoke_tests/test_backward_compat.py` (local)
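
For readers who want the timing argument spelled out, here is a minimal sketch of a sequential connectivity check of the kind the description talks about. It is not the project's actual code: the names `has_network`, `tcp_probe`, and `worst_case_wait` are hypothetical, the probe is a plain TCP connect to port 443 rather than whatever request the real check issues, and it assumes the retry count stays at 3.

```python
import socket
from typing import Callable, Sequence

# Hypothetical constants: the order and the 1s timeout mirror the PR description.
ENDPOINTS = ('https://8.8.8.8', 'https://1.1.1.1')
RETRIES = 3  # assumption: the retry count is unchanged by the PR
TIMEOUT_SECONDS = 1.0


def tcp_probe(url: str, timeout: float) -> bool:
    """Return True if a TCP connection to port 443 of the host succeeds."""
    host = url.split('://', 1)[-1]
    try:
        with socket.create_connection((host, 443), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return False


def has_network(endpoints: Sequence[str] = ENDPOINTS,
                retries: int = RETRIES,
                timeout: float = TIMEOUT_SECONDS,
                probe: Callable[[str, float], bool] = tcp_probe) -> bool:
    """Try each endpoint in order and stop at the first successful probe."""
    for url in endpoints:
        for _ in range(retries):
            if probe(url, timeout):
                return True
    return False


def worst_case_wait(endpoints_before: int, retries: int, timeout: float) -> float:
    """Upper bound in seconds spent on earlier endpoints before the next is tried."""
    return endpoints_before * retries * timeout
```

With the old settings, `worst_case_wait(1, 3, 3.0)` gives the 9 seconds mentioned above; with a 1s timeout the same bound drops to 3 seconds. Because the loop returns on the first success, putting the more reliable endpoint first matters mostly when the other one is unreachable.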