Cap database connection pool size for table-diff and repset-diff #101
mason-sharp merged 3 commits into main from
Conversation
Previously, pgxpool defaulted MaxConns to max(4, NumCPU), meaning the connection pool was unbounded relative to the concurrency factor. On high-core machines this could exhaust database connections.

Pool size is now derived from the concurrency factor (NumCPU × ConcurrencyFactor, minimum 4). A new --max-connections / -m flag (and table_diff.max_connections in ace.yaml) lets users set a hard cap that overrides the derived value.

Adds an integration test that sets MaxConnections=2 with 16 workers and asserts the cap holds strictly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
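The sizing behavior described above can be sketched in Go. This is a hedged sketch: the function and parameter names are illustrative, not copied from the ACE source.

```go
package main

import (
	"fmt"
	"runtime"
)

// derivePoolSize mirrors the behavior described in the PR summary: the pool
// is sized from NumCPU × ConcurrencyFactor with a floor of 4, and an explicit
// maxConnections (> 0) acts as a hard cap that overrides the derived value.
func derivePoolSize(concurrencyFactor float64, maxConnections int) int {
	derived := int(float64(runtime.NumCPU()) * concurrencyFactor)
	if derived < 4 {
		derived = 4 // floor applies to the derived value only
	}
	if maxConnections > 0 {
		return maxConnections // hard cap, honored even below the floor
	}
	return derived
}

func main() {
	fmt.Println(derivePoolSize(0.5, 0)) // derived, machine-dependent
	fmt.Println(derivePoolSize(0.5, 2)) // 2: capped, as in the integration test
}
```

Note that the cap wins outright, which is why the integration test can assert a strict limit of 2 even with 16 workers.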
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info — ⚙️ Configuration used: Organization UI · Review profile: CHILL · Plan: Pro · Run ID:

📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 Walkthrough

Added a new max_connections setting (default 0) that caps DB connections per node for diff operations. The option is exposed in ace.yaml, CLI flags, HTTP/OpenAPI requests, task structs, connection-pool sizing logic, job builders, tests, CI, and documentation.
🚥 Pre-merge checks: ✅ 2 passed · ❌ 1 failed (warning)
Up to standards ✅

🟢 Issues

| Category | Results |
|---|---|
| Complexity | 2 medium |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 23 |
| Duplication | 3 |
Actionable comments posted: 4
🧹 Nitpick comments (1)
docs/performance.md (1)
37-39: Consider simplifying wording. The documentation content is excellent and provides clear guidance on when to use `--max-connections`. Minor stylistic improvement: "a large number of connections" could be simplified to "many connections" for conciseness.

✏️ Suggested wording tweak:

- **`--max-connections`** - Caps the number of database connections ACE opens per node. By default, the pool size is derived from `--concurrency-factor` and the number of CPUs. On machines with many cores, this can result in a large number of connections. Use `--max-connections` to set a hard upper limit, or set `table_diff.max_connections` in `ace.yaml` to apply it globally. This is especially useful when running ACE against databases with limited `max_connections` or when sharing the database with other applications.
+ **`--max-connections`** - Caps the number of database connections ACE opens per node. By default, the pool size is derived from `--concurrency-factor` and the number of CPUs. On machines with many cores, this can result in many connections. Use `--max-connections` to set a hard upper limit, or set `table_diff.max_connections` in `ace.yaml` to apply it globally. This is especially useful when running ACE against databases with limited `max_connections` or when sharing the database with other applications.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/performance.md` around lines 37 - 39, Update the phrasing in the docs entry for `--max-connections` to replace "a large number of connections" with the more concise "many connections", keeping the rest of the content and references to `--concurrency-factor`, `table_diff.max_connections`, and `ace.yaml` unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/http-api.md`:
- Line 96: The docs advertise a `max_connections` request field that the server
doesn't accept; remove (or mark as unsupported) every `max_connections` entry in
docs/http-api.md so clients aren't misled, and do not add it back until the
corresponding request structs in internal/api/http/handler.go (the request types
used by the affected endpoints) expose a MaxConnections field; alternatively, if
you intend to support it now, add a MaxConnections json-tagged field to the
relevant request structs in internal/api/http/handler.go and wire it through the
handler logic, then update the docs.
In `@docs/openapi.yaml`:
- Around line 468-470: The OpenAPI schema added an unsupported max_connections
field (max_connections / MaxConnections) that the HTTP layer doesn't accept;
either remove these schema entries from docs/openapi.yaml (the three
occurrences) or implement the corresponding MaxConnections field and JSON/tag
handling in the request structs and handlers under internal/api/http/handler.go
so the server will accept and parse it; if you choose to implement, add
MaxConnections to the relevant request types with the correct
json:"max_connections" tag, update any validation/decoding logic in the handler
functions that consume those structs, and add tests to cover the new field.
In `@internal/cli/cli.go`:
- Around line 105-110: The new --max-connections flag is exposed via diffFlags
but not used by schema-diff (it's silently ignored); either remove the IntFlag
from diffFlags so schema-diff cannot parse it, or wire it end-to-end: add
handling in SchemaDiffCLI to copy the parsed "--max-connections" value into the
SchemaDiff task object and update buildSchemaDiffJob to read that task field
(max_connections) when constructing the job. Reference diffFlags,
schemaDiffFlags, SchemaDiffCLI, buildSchemaDiffJob and the "max-connections"
flag when making the change.
In `@internal/consistency/diff/table_diff.go`:
- Around line 1051-1064: TableDiffTask currently ignores the configured ace.yaml
cap because maxPoolSize() only uses t.MaxConnections (which stays 0 for
interactive TableDiffCLI); update maxPoolSize() (or connOpts()) to respect the
global fallback by using cfg.TableDiff.MaxConnections when t.MaxConnections == 0
(i.e. if t.MaxConnections <= 0, substitute the configured cap), so
TableDiffTask.maxPoolSize() and TableDiffTask.connOpts() enforce the ace.yaml
max_connections cap for both scheduled jobs and direct table-diff runs.
---
Nitpick comments:
In `@docs/performance.md`:
- Around line 37-39: Update the phrasing in the docs entry for
`--max-connections` to replace "a large number of connections" with the more
concise "many connections", keeping the rest of the content and references to
`--concurrency-factor`, `table_diff.max_connections`, and `ace.yaml` unchanged.
ℹ️ Review info — ⚙️ Configuration used: Organization UI · Review profile: CHILL · Plan: Pro · Run ID: d196668f-3dbd-4e00-b350-3730dfca94be
📒 Files selected for processing (15)
- ace.yaml
- docs/api.md
- docs/commands/diff/repset-diff.md
- docs/commands/diff/table-diff.md
- docs/configuration.md
- docs/design/table_diff.md
- docs/http-api.md
- docs/openapi.yaml
- docs/performance.md
- internal/cli/cli.go
- internal/consistency/diff/repset_diff.go
- internal/consistency/diff/table_diff.go
- internal/jobs/config.go
- pkg/config/config.go
- tests/integration/repset_diff_conntrack_test.go
Addresses CodeRabbit review findings:
- Add MaxConnections field to HTTP request structs for table-diff, schema-diff, and repset-diff endpoints with resolveMaxConnections helper, so API clients can set the connection cap.
- Add MaxConnections to SchemaDiffCmd (struct, per-table wiring, CloneForSchedule) and schema-diff CLI/jobs builders.
- Apply ace.yaml table_diff.max_connections fallback in TableDiffTask.Validate() so the config cap is honored on all paths (CLI, HTTP, scheduler), not just scheduled jobs.
- Validate max_connections >= 1 when set (reject negative values).
- Change alias from -m to -M to avoid collision with mtree's --max-cpu-ratio alias.
- Add TestRepsetDiff_MaxConnectionsCap to CI workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/api/http/handler.go`:
- Around line 298-306: resolveMaxConnections currently treats any value <1 as
unset and silently falls back, so change APIServer.resolveMaxConnections to
detect negative requested values and return an error (e.g., (int, error)) when
requested < 0; keep the existing behavior for requested == 0 and cfg fallback,
and return the resolved int with nil error for valid positive inputs. Update the
diff endpoint handlers that call resolveMaxConnections to check the returned
error and respond with HTTP 400 when negative input is detected, ensuring the
pool cap validation fails fast. Ensure the function name resolveMaxConnections
is updated where referenced so callers handle the new error return.
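The suggested signature change can be sketched like this. The function name matches the one the comment references, but the body and calling convention here are an illustrative assumption, not the handler's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveMaxConnections sketches the suggested fix: 0 means "unset" and
// falls back to the configured cap, positive values win, and negative
// values are rejected so the HTTP handler can fail fast with a 400.
func resolveMaxConnections(requested, configured int) (int, error) {
	if requested < 0 {
		return 0, errors.New("max_connections must be >= 0")
	}
	if requested == 0 {
		return configured, nil // fall back to the ace.yaml value (may also be 0)
	}
	return requested, nil
}

func main() {
	n, err := resolveMaxConnections(0, 8)
	fmt.Println(n, err) // 8 <nil>
	_, err = resolveMaxConnections(-1, 8)
	fmt.Println(err != nil) // true
}
```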
In `@internal/consistency/diff/schema_diff.go`:
- Line 527: SchemaDiffCmd currently copies MaxConnections into each tdTask
(tdTask.MaxConnections = task.MaxConnections) but SchemaDiffCmd.Validate() does
not reject negative values, causing failures later in tdTask.Validate(); update
SchemaDiffCmd.Validate() to mirror the guard used in table-diff by checking
SchemaDiffCmd.MaxConnections and returning a validation error if it's < 0 (and
similarly for any other numeric limits applied later), so invalid MaxConnections
fails fast before the fan-out to tdTask.
In `@internal/consistency/diff/table_diff.go`:
- Around line 732-737: The code currently only copies
cfg.TableDiff.MaxConnections when it's > 0, allowing negative values in the
config to be ignored and bypass validation; change the logic to either (A) copy
any non-zero config value so negatives reach validation (replace the > 0 check
with != 0 when assigning t.MaxConnections from cfg.TableDiff.MaxConnections), or
(B) explicitly reject a negative config early (check
cfg.TableDiff.MaxConnections < 0 and return an error) before falling back to
derived values; apply this change around the t.MaxConnections assignment and the
subsequent t.MaxConnections < 0 validation so negative values are surfaced
(references: t.MaxConnections, cfg.TableDiff.MaxConnections).
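Option (B) above can be sketched as an early guard. The names here are simplified stand-ins for the fields in table_diff.go and pkg/config, used only to illustrate the shape of the fix.

```go
package main

import (
	"errors"
	"fmt"
)

type tableDiffConfig struct{ MaxConnections int }

// applyConfigCap sketches option (B): reject a negative config value
// explicitly before falling back to derived sizing, so a bad ace.yaml
// surfaces immediately instead of being silently ignored.
func applyConfigCap(taskMax int, cfg tableDiffConfig) (int, error) {
	if cfg.MaxConnections < 0 {
		return 0, errors.New("table_diff.max_connections must be >= 0")
	}
	if taskMax == 0 {
		return cfg.MaxConnections, nil // fall back to the config cap
	}
	return taskMax, nil
}

func main() {
	_, err := applyConfigCap(0, tableDiffConfig{MaxConnections: -1})
	fmt.Println(err != nil) // true: negative config surfaces as an error
}
```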
In `@internal/jobs/config.go`:
- Around line 112-114: The code currently treats negative returns from intArg as
"unset" because branches only accept values > 0; change this to reject negatives
by validating intArg results before persisting into base.MaxConnections (and the
other spots handling max_connections) — if v < 0 return/propagate a validation
error instead of ignoring it. Implement this validation in a shared helper
(e.g., validateNonNegativeIntArg or extend intArg to accept a minimum allowed
value) and use that helper when building scheduled tasks so any max_connections
< 0 fails fast rather than defaulting to derived pool sizes; update all
occurrences (the other max_connections branches) to use the same helper.
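A shared helper of the kind the comment proposes could look like this. The helper name, argument map, and key are illustrative assumptions about internal/jobs/config.go, not its actual API.

```go
package main

import "fmt"

// intArgMin sketches the proposed shared helper: read an integer job
// argument and fail fast when it is below the allowed minimum, instead of
// silently treating out-of-range values as "unset".
func intArgMin(args map[string]any, key string, minAllowed int) (int, error) {
	raw, ok := args[key]
	if !ok {
		return 0, nil // unset: caller falls back to derived pool sizing
	}
	v, ok := raw.(int)
	if !ok {
		return 0, fmt.Errorf("%s: expected int, got %T", key, raw)
	}
	if v < minAllowed {
		return 0, fmt.Errorf("%s must be >= %d, got %d", key, minAllowed, v)
	}
	return v, nil
}

func main() {
	args := map[string]any{"max_connections": -3}
	_, err := intArgMin(args, "max_connections", 0)
	fmt.Println(err != nil) // true: negative value fails fast
}
```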
ℹ️ Review info — ⚙️ Configuration used: Organization UI · Review profile: CHILL · Plan: Pro · Run ID: e51ef3ba-a44c-4efa-803f-95fd3c1d6257
📒 Files selected for processing (9)
- .github/workflows/test.yml
- docs/api.md
- docs/commands/diff/repset-diff.md
- docs/commands/diff/table-diff.md
- internal/api/http/handler.go
- internal/cli/cli.go
- internal/consistency/diff/schema_diff.go
- internal/consistency/diff/table_diff.go
- internal/jobs/config.go
✅ Files skipped from review due to trivial changes (4)
- docs/commands/diff/repset-diff.md
- docs/commands/diff/table-diff.md
- docs/api.md
- .github/workflows/test.yml
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/cli/cli.go
…aths
- Let negative values pass through resolvers and config fallbacks instead of silently treating them as unset, so Validate() rejects them consistently.
- Add max_connections validation to RepsetDiffCmd.Validate() and SchemaDiffCmd.Validate() to fail fast before per-table fan-out.
- Simplify jobs config assignments to unconditional intArg calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
docs/commands/diff/repset-diff.md (Outdated)
| `--concurrency-factor <float>` | `-c` | CPU ratio for concurrency (0.0–4.0). Default `0.5`. |
| `--compare-unit-size <int>` | `-u` | Recursive split size for mismatched blocks. Default `10000`. |
| `--output <json\|html>` | `-o` | Per-table diff report format. Default `json`. |
| `--max-connections <int>` | `-m` | Maximum database connections per node. Caps the pool regardless of concurrency factor. Default: derived. |
LGTM. Maybe it makes sense to document how max-connections counteracts the floor value.