[Core] Add wait_for in Autostop config, check for active SSH sessions by default
#6361
/smoke-test -k test_autostop_with_ssh_sessions
/smoke-test -k test_autostop
/smoke-test -k test_autostop_with_ssh_sessions
/smoke-test -k test_autostop_with_ssh_sessions --gcp
/smoke-test -k test_autostop_with_ssh_sessions --aws
/smoke-test -k test_autostop_wait_for_jobs --aws
/smoke-test -k test_autostop_wait_for_jobs_and_ssh --gcp
/smoke-test -k test_autostop_wait_for_none
/quicktest-core
/smoke-test -k test_autostop_wait_for_jobs
/smoke-test -k test_autostop_wait_for_jobs
/smoke-test -k test_autostop_wait_for_jobs
/quicktest-core
/smoke-test -k test_autostop_wait_for_jobs
Should we check backward compatibility between different client and server versions, given that this PR includes some API changes?
/quicktest-core
This PR introduces a new Autostop config, wait_for, which determines the condition to check for when resetting the idleness timer. This option works in conjunction with idle_minutes. This config is an enum:

With the introduction of "none" for doing hard stops, we should also rename idle_minutes, as it might be confusing for users. We can do that in a follow-up PR, along with making the field accept arbitrary time units (not only minutes). I've created #6382 to track that.

Context:
Today, idleness for autodown/stop is measured from the completion of all tasks in the cluster's queue. However, a user may be SSH'd in, for example to debug something, so we should count SSH activity as non-idle.

This PR implements this by listing the contents of /dev/pts (the virtual filesystem where pseudo-terminal devices reside) and counting the entries (excluding ptmx).

Initially, we considered using who, which is part of GNU coreutils. The command is simple: it just prints a list of users who are currently logged in. But we found that SSH sessions from Cursor (probably VS Code too) did not show up under who (thanks @concretevitamin for the reminder to test this path!), likely because their Remote-SSH extension does not spawn an interactive login process.

Another approach considered was:
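To illustrate the /dev/pts check that this PR adopts, together with how wait_for could gate the idleness timer, here is a minimal Python sketch. It is not the code from this PR: the WaitFor class, the function names, and the enum value names "jobs_and_ssh" and "jobs" are assumptions (inferred from the smoke test names; only "none" is named in the description), and the decision logic is one plausible reading of the behavior described above.

```python
import enum
import os
from typing import Iterable


class WaitFor(enum.Enum):
    """Hypothetical mirror of the wait_for enum; value names are assumed."""
    JOBS_AND_SSH = 'jobs_and_ssh'  # assumed default: jobs or SSH keep it alive
    JOBS = 'jobs'                  # only running/pending jobs count
    NONE = 'none'                  # hard stop: nothing resets the timer


def count_pts_entries(entries: Iterable[str]) -> int:
    """Count pseudo-terminal entries, skipping the ptmx multiplexer node."""
    return sum(1 for name in entries if name != 'ptmx')


def count_active_ptys(pts_dir: str = '/dev/pts') -> int:
    """List pts_dir and count its entries, excluding ptmx."""
    try:
        return count_pts_entries(os.listdir(pts_dir))
    except FileNotFoundError:
        # No devpts filesystem at this path (e.g. a non-Linux host).
        return 0


def should_reset_idle_timer(wait_for: WaitFor, has_active_jobs: bool,
                            num_ptys: int) -> bool:
    """Decide whether the cluster currently counts as non-idle."""
    if wait_for is WaitFor.NONE:
        return False
    if wait_for is WaitFor.JOBS:
        return has_active_jobs
    # JOBS_AND_SSH: either active jobs or an open pseudo-terminal.
    return has_active_jobs or num_ptys > 0


if __name__ == '__main__':
    ptys = count_active_ptys()
    print(f'open pseudo-terminals: {ptys}')
    print(should_reset_idle_timer(WaitFor.JOBS_AND_SSH, False, ptys))
```

Note that counting /dev/pts entries counts every open pseudo-terminal, not only those created by SSH, so a sketch like this treats any allocated terminal as activity.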
Tested (run the relevant ones):
- bash format.sh
- New smoke tests: test_autostop_wait_for_jobs_and_ssh, test_autostop_wait_for_none
- Manually tested by ssh'ing from the terminal
- /smoke-test (CI) or pytest tests/test_smoke.py (local)
- /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
- /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)