@SeungjinYang SeungjinYang commented Sep 3, 2025

If the Postgres DB supports enough connections for every process (executor, uvicorn worker, main loop), use a persistent connection instead of NullPool. NullPool sets up and tears down a connection for every DB operation, so a persistent connection saves that per-operation setup/teardown time.

We have been using NullPool to limit the total number of connections open at any given time, because we had no visibility into the number of connections a PostgreSQL instance supports. Now that we can get this number, we can make an informed choice about the trade-off.
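The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the `reserved` safety margin, and the choice of `pool_size=1` with `max_overflow=0` to model a single persistent connection per process are all assumptions. Only `NullPool` and the `create_engine()` keyword arguments are real SQLAlchemy names.

```python
from typing import Any, Dict


def can_use_persistent_connections(max_connections: int,
                                   num_processes: int,
                                   reserved: int = 0) -> bool:
    """Return True if every process can hold one connection open at once.

    `reserved` is a hypothetical safety margin for other clients
    (admin sessions, migrations); it is not taken from the PR.
    """
    if max_connections <= 0 or num_processes <= 0:
        return False
    return num_processes <= max_connections - reserved


def engine_kwargs(max_connections: int,
                  num_processes: int,
                  reserved: int = 0) -> Dict[str, Any]:
    """Build keyword arguments for sqlalchemy.create_engine() (assumed usage)."""
    if can_use_persistent_connections(max_connections, num_processes,
                                      reserved):
        # Default QueuePool, capped at one long-lived connection per process.
        return {'pool_size': 1, 'max_overflow': 0}
    # Not enough headroom: open and close a connection for every operation.
    # Imported lazily so the decision logic has no third-party dependency.
    from sqlalchemy.pool import NullPool
    return {'poolclass': NullPool}
```

Under these assumptions, `max_connections` could be read from the server (for example with `SHOW max_connections` in PostgreSQL), and `num_processes` would be the combined count of executors, uvicorn workers, and the main loop. The result would be passed as `create_engine(url, **engine_kwargs(...))`.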

Testing:

Ran sky status against a remote API server on GKE with a GCP Cloud SQL backend, using IAM auth.

The first few runs of sky status are discarded so that modules are fully loaded and the DB connections are initialized.

this PR
2.921
2.810
2.874
2.970
2.850
avg: 2.885

master
5.515
5.492
6.008
5.590
6.111
avg: 5.7432

This PR is approximately 2x faster (average 2.885 vs. 5.7432 on master).
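As a quick check, the averages and the speedup ratio can be recomputed from the timings listed above. The variable names are illustrative only, and the numbers are copied as reported, without assuming a unit.

```python
from statistics import mean

# Timings copied from the two lists above.
this_pr = [2.921, 2.810, 2.874, 2.970, 2.850]
master = [5.515, 5.492, 6.008, 5.590, 6.111]

avg_pr = mean(this_pr)
avg_master = mean(master)

# Ratio of the averages: how many times faster this PR is than master.
speedup = avg_master / avg_pr

print(f'this PR avg: {avg_pr:.3f}')
print(f'master avg:  {avg_master:.4f}')
print(f'speedup:     {speedup:.2f}x')
```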

Tested (run the relevant ones):

  • Code formatting: install pre-commit (auto-check on commit) or bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: /smoke-test (CI) or pytest tests/test_smoke.py (local)
  • Relevant individual tests: /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
  • Backward compatibility: /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)

SeungjinYang commented Sep 3, 2025

/smoke-test --aws --postgres https://buildkite.com/skypilot-1/smoke-tests/builds/3016 (PASS)

SeungjinYang commented Sep 3, 2025

/smoke-test -k test_managed_jobs_recovery_aws (PASS)
/smoke-test -k test_managed_jobs_storage (PASS)

SeungjinYang commented

/smoke-test --aws --remote-server

@SeungjinYang SeungjinYang marked this pull request as ready for review September 3, 2025 21:59
@SeungjinYang SeungjinYang requested review from aylei and cg505 September 3, 2025 21:59
@SeungjinYang SeungjinYang changed the title from "[db] further connection optimization" to "[db] connection pool optimization - 2x speedup in remote postgres environment" Sep 3, 2025
@rohansonecha rohansonecha self-requested a review September 3, 2025 22:01
@rohansonecha rohansonecha left a comment

This is awesome! Thank you @SeungjinYang

@aylei aylei left a comment

LGTM, thanks @SeungjinYang !

@SeungjinYang SeungjinYang merged commit cfb3029 into master Sep 4, 2025
19 checks passed
@SeungjinYang SeungjinYang deleted the db-conn-optimization branch September 4, 2025 03:07