
Cap database connection pool size for table-diff and repset-diff#101

Merged
mason-sharp merged 3 commits into main from
fix/ACE-180/conn-concurrency
Apr 6, 2026

Conversation

@mason-sharp
Member

Previously, pgxpool defaulted MaxConns to max(4, NumCPU), so the connection pool size was not tied to the concurrency factor. On high-core machines this could exhaust database connections.

Pool size is now derived from the concurrency factor (NumCPU × ConcurrencyFactor, minimum 4). A new --max-connections / -m flag (and table_diff.max_connections in ace.yaml) lets users set a hard cap that overrides the derived value.

Adds integration test that sets MaxConnections=2 with 16 workers and asserts the cap holds strictly.


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c16f1e62-e94b-48a9-87d8-f2758d7a0902

📥 Commits

Reviewing files that changed from the base of the PR and between 2c33ea3 and 092e5b8.

📒 Files selected for processing (5)
  • internal/api/http/handler.go
  • internal/consistency/diff/repset_diff.go
  • internal/consistency/diff/schema_diff.go
  • internal/consistency/diff/table_diff.go
  • internal/jobs/config.go
✅ Files skipped from review due to trivial changes (1)
  • internal/api/http/handler.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • internal/consistency/diff/repset_diff.go
  • internal/jobs/config.go
  • internal/consistency/diff/table_diff.go
  • internal/consistency/diff/schema_diff.go

📝 Walkthrough

Walkthrough

Added a new max_connections setting (default 0) that caps DB connections per node for diff operations. The option is exposed in ace.yaml, CLI flags, HTTP/OpenAPI requests, task structs, connection-pool sizing logic, job builders, tests, CI, and documentation.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Configuration | `ace.yaml`, `pkg/config/config.go` | Added `table_diff.max_connections` (`yaml:"max_connections"`, default 0) to configuration types and schema. |
| CLI & Job wiring | `internal/cli/cli.go`, `internal/jobs/config.go` | Added shared `--max-connections` flag and wired parsed value into task `MaxConnections`; job builders read `max_connections` arg and fall back to `cfg.TableDiff.MaxConnections` when task value is 0. |
| API / HTTP / OpenAPI | `internal/api/http/handler.go`, `docs/openapi.yaml`, `docs/http-api.md` | Added `max_connections` request field to table-diff, schema-diff, repset-diff handlers and OpenAPI schemas; server resolves requested value, falling back to configured value. |
| Core Diff Logic | `internal/consistency/diff/table_diff.go`, `internal/consistency/diff/schema_diff.go`, `internal/consistency/diff/repset_diff.go` | Added `MaxConnections int` fields to task/command types; `Validate()` rejects negatives and derives from config when zero; propagated value into `tdTask` clones; implemented `maxPoolSize()` and used it to set DB pool size. |
| Tests & CI | `tests/integration/repset_diff_conntrack_test.go`, `.github/workflows/test.yml` | Added integration test `TestRepsetDiff_MaxConnectionsCap` and connection-monitor helper to assert per-node connection cap and detect leaks; CI workflow step added to run that test. |
| Documentation | `docs/api.md`, `docs/commands/diff/table-diff.md`, `docs/commands/diff/repset-diff.md`, `docs/configuration.md`, `docs/design/table_diff.md`, `docs/performance.md` | Documented new `--max-connections` flag (aliases noted in docs) and `table_diff.max_connections` behavior, default derivation, and override semantics relative to concurrency factor. |

Poem

🐇 I count each hop and every thread,
I tuck excess connections into bed,
A tiny cap, a gentle bound,
Two per node (or what you found),
The diff hums on — no leaks, just peace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding connection pool size caps for table-diff and repset-diff commands. |
| Description check | ✅ Passed | The description is clearly related to the changeset, explaining the problem (unbounded pool size), solution (derive from concurrency factor with cap), and implementation (new flag and config option). |


@codacy-production

codacy-production bot commented Apr 6, 2026

Up to standards ✅

🟢 Issues 2 medium

Results:
2 new issues

Category Results
Complexity 2 medium


🟢 Metrics 23 complexity · 3 duplication

Metric Results
Complexity 23
Duplication 3


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
docs/performance.md (1)

37-39: Consider simplifying wording.

The documentation content is excellent and provides clear guidance on when to use --max-connections. Minor stylistic improvement: "a large number of connections" could be simplified to "many connections" for conciseness.

✏️ Suggested wording tweak
- **`--max-connections`**
-    Caps the number of database connections ACE opens per node. By default, the pool size is derived from `--concurrency-factor` and the number of CPUs. On machines with many cores, this can result in a large number of connections. Use `--max-connections` to set a hard upper limit, or set `table_diff.max_connections` in `ace.yaml` to apply it globally. This is especially useful when running ACE against databases with limited `max_connections` or when sharing the database with other applications.
+ **`--max-connections`**
+    Caps the number of database connections ACE opens per node. By default, the pool size is derived from `--concurrency-factor` and the number of CPUs. On machines with many cores, this can result in many connections. Use `--max-connections` to set a hard upper limit, or set `table_diff.max_connections` in `ace.yaml` to apply it globally. This is especially useful when running ACE against databases with limited `max_connections` or when sharing the database with other applications.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/performance.md` around lines 37 - 39, Update the phrasing in the docs
entry for `--max-connections` to replace "a large number of connections" with
the more concise "many connections", keeping the rest of the content and
references to `--concurrency-factor`, `table_diff.max_connections`, and
`ace.yaml` unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/http-api.md`:
- Line 96: The docs advertise a `max_connections` request field that the server
doesn't accept; remove (or mark as unsupported) every `max_connections` entry in
docs/http-api.md so clients aren't misled, and do not add it back until the
corresponding request structs in internal/api/http/handler.go (the request types
used by the affected endpoints) expose a MaxConnections field; alternatively, if
you intend to support it now, add a MaxConnections json-tagged field to the
relevant request structs in internal/api/http/handler.go and wire it through the
handler logic, then update the docs.

In `@docs/openapi.yaml`:
- Around line 468-470: The OpenAPI schema added an unsupported max_connections
field (max_connections / MaxConnections) that the HTTP layer doesn't accept;
either remove these schema entries from docs/openapi.yaml (the three
occurrences) or implement the corresponding MaxConnections field and JSON/tag
handling in the request structs and handlers under internal/api/http/handler.go
so the server will accept and parse it; if you choose to implement, add
MaxConnections to the relevant request types with the correct
json:"max_connections" tag, update any validation/decoding logic in the handler
functions that consume those structs, and add tests to cover the new field.

In `@internal/cli/cli.go`:
- Around line 105-110: The new --max-connections flag is exposed via diffFlags
but not used by schema-diff (it's silently ignored); either remove the IntFlag
from diffFlags so schema-diff cannot parse it, or wire it end-to-end: add
handling in SchemaDiffCLI to copy the parsed "--max-connections" value into the
SchemaDiff task object and update buildSchemaDiffJob to read that task field
(max_connections) when constructing the job. Reference diffFlags,
schemaDiffFlags, SchemaDiffCLI, buildSchemaDiffJob and the "max-connections"
flag when making the change.

In `@internal/consistency/diff/table_diff.go`:
- Around line 1051-1064: TableDiffTask currently ignores the configured ace.yaml
cap because maxPoolSize() only uses t.MaxConnections (which stays 0 for
interactive TableDiffCLI); update maxPoolSize() (or connOpts()) to respect the
global fallback by using cfg.TableDiff.MaxConnections when t.MaxConnections == 0
(i.e. if t.MaxConnections <= 0, substitute the configured cap), so
TableDiffTask.maxPoolSize() and TableDiffTask.connOpts() enforce the ace.yaml
max_connections cap for both scheduled jobs and direct table-diff runs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d196668f-3dbd-4e00-b350-3730dfca94be

📥 Commits

Reviewing files that changed from the base of the PR and between 90872e7 and 070e68a.

📒 Files selected for processing (15)
  • ace.yaml
  • docs/api.md
  • docs/commands/diff/repset-diff.md
  • docs/commands/diff/table-diff.md
  • docs/configuration.md
  • docs/design/table_diff.md
  • docs/http-api.md
  • docs/openapi.yaml
  • docs/performance.md
  • internal/cli/cli.go
  • internal/consistency/diff/repset_diff.go
  • internal/consistency/diff/table_diff.go
  • internal/jobs/config.go
  • pkg/config/config.go
  • tests/integration/repset_diff_conntrack_test.go

Addresses CodeRabbit review findings:

- Add MaxConnections field to HTTP request structs for table-diff,
  schema-diff, and repset-diff endpoints with resolveMaxConnections
  helper, so API clients can set the connection cap.
- Add MaxConnections to SchemaDiffCmd (struct, per-table wiring,
  CloneForSchedule) and schema-diff CLI/jobs builders.
- Apply ace.yaml table_diff.max_connections fallback in
  TableDiffTask.Validate() so the config cap is honored on all
  paths (CLI, HTTP, scheduler), not just scheduled jobs.
- Validate max_connections >= 1 when set (reject negative values).
- Change alias from -m to -M to avoid collision with mtree's
  --max-cpu-ratio alias.
- Add TestRepsetDiff_MaxConnectionsCap to CI workflow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/api/http/handler.go`:
- Around line 298-306: resolveMaxConnections currently treats any value <1 as
unset and silently falls back, so change APIServer.resolveMaxConnections to
detect negative requested values and return an error (e.g., (int, error)) when
requested < 0; keep the existing behavior for requested == 0 and cfg fallback,
and return the resolved int with nil error for valid positive inputs. Update the
diff endpoint handlers that call resolveMaxConnections to check the returned
error and respond with HTTP 400 when negative input is detected, ensuring the
pool cap validation fails fast. Ensure the function name resolveMaxConnections
is updated where referenced so callers handle the new error return.

In `@internal/consistency/diff/schema_diff.go`:
- Line 527: SchemaDiffCmd currently copies MaxConnections into each tdTask
(tdTask.MaxConnections = task.MaxConnections) but SchemaDiffCmd.Validate() does
not reject negative values, causing failures later in tdTask.Validate(); update
SchemaDiffCmd.Validate() to mirror the guard used in table-diff by checking
SchemaDiffCmd.MaxConnections and returning a validation error if it's < 0 (and
similarly for any other numeric limits applied later), so invalid MaxConnections
fails fast before the fan-out to tdTask.

In `@internal/consistency/diff/table_diff.go`:
- Around line 732-737: The code currently only copies
cfg.TableDiff.MaxConnections when it's > 0, allowing negative values in the
config to be ignored and bypass validation; change the logic to either (A) copy
any non-zero config value so negatives reach validation (replace the > 0 check
with != 0 when assigning t.MaxConnections from cfg.TableDiff.MaxConnections), or
(B) explicitly reject a negative config early (check
cfg.TableDiff.MaxConnections < 0 and return an error) before falling back to
derived values; apply this change around the t.MaxConnections assignment and the
subsequent t.MaxConnections < 0 validation so negative values are surfaced
(references: t.MaxConnections, cfg.TableDiff.MaxConnections).

In `@internal/jobs/config.go`:
- Around line 112-114: The code currently treats negative returns from intArg as
"unset" because branches only accept values > 0; change this to reject negatives
by validating intArg results before persisting into base.MaxConnections (and the
other spots handling max_connections) — if v < 0 return/propagate a validation
error instead of ignoring it. Implement this validation in a shared helper
(e.g., validateNonNegativeIntArg or extend intArg to accept a minimum allowed
value) and use that helper when building scheduled tasks so any max_connections
< 0 fails fast rather than defaulting to derived pool sizes; update all
occurrences (the other max_connections branches) to use the same helper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e51ef3ba-a44c-4efa-803f-95fd3c1d6257

📥 Commits

Reviewing files that changed from the base of the PR and between 070e68a and 2c33ea3.

📒 Files selected for processing (9)
  • .github/workflows/test.yml
  • docs/api.md
  • docs/commands/diff/repset-diff.md
  • docs/commands/diff/table-diff.md
  • internal/api/http/handler.go
  • internal/cli/cli.go
  • internal/consistency/diff/schema_diff.go
  • internal/consistency/diff/table_diff.go
  • internal/jobs/config.go
✅ Files skipped from review due to trivial changes (4)
  • docs/commands/diff/repset-diff.md
  • docs/commands/diff/table-diff.md
  • docs/api.md
  • .github/workflows/test.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/cli/cli.go

…aths

- Let negative values pass through resolvers and config fallbacks
  instead of silently treating them as unset, so Validate() rejects
  them consistently.
- Add max_connections validation to RepsetDiffCmd.Validate() and
  SchemaDiffCmd.Validate() to fail fast before per-table fan-out.
- Simplify jobs config assignments to unconditional intArg calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| `--concurrency-factor <float>` | `-c` | CPU ratio for concurrency (0.0–4.0). Default `0.5`. |
| `--compare-unit-size <int>` | `-u` | Recursive split size for mismatched blocks. Default `10000`. |
| `--output <json\|html>` | `-o` | Per-table diff report format. Default `json`. |
| `--max-connections <int>` | `-m` | Maximum database connections per node. Caps the pool regardless of concurrency factor. Default: derived. |
Contributor


LGTC. Maybe it makes sense to document how --max-connections counteracts the floor value.

@mason-sharp mason-sharp merged commit b06ad91 into main Apr 6, 2026
4 checks passed