Skip to content

Analysis for async performance issues#4128

Open
paulmedynski wants to merge 6 commits intomainfrom
dev/paul/async-planning
Open

Analysis for async performance issues#4128
paulmedynski wants to merge 6 commits intomainfrom
dev/paul/async-planning

Conversation

@paulmedynski
Copy link
Copy Markdown
Contributor

Description

This PR contains lots of research, anaylsis, and summary artifacts related to a variety of async performance issues we currently have in the driver. It will be used and expanded on as we choose specific targets to work on.

The first effort will focus on Unix thread starvation due to interactions with login flows and query executions.

The PR also sets up the plans/ directory where we can start dumping plans and artifacts from other investigations, and tries to establish some style standards for Markdown.

Copilot AI review requested due to automatic review settings April 1, 2026 18:31
@paulmedynski paulmedynski requested a review from a team as a code owner April 1, 2026 18:31
@github-project-automation github-project-automation bot moved this to To triage in SqlClient Board Apr 1, 2026
@paulmedynski paulmedynski moved this from To triage to In review in SqlClient Board Apr 1, 2026
@paulmedynski paulmedynski added this to the 7.1.0-preview1 milestone Apr 1, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new plans/ workspace containing markdown-formatted research artifacts on async performance issues in Microsoft.Data.SqlClient, along with a new repro app project that’s included in the main solution.

Changes:

  • Introduces plans/async-performance/ research docs (root causes, issue inventory, prioritized recommendations, and detailed fix breakdowns).
  • Adds a new ThreadStarvation repro console app under plans/async-performance/apps/ and includes it in src/Microsoft.Data.SqlClient.slnx.
  • Adds/updates formatting and style guidance for markdown plan documents and C# coding style.

Reviewed changes

Copilot reviewed 72 out of 72 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/Microsoft.Data.SqlClient.slnx Adds the ThreadStarvation repro app project to the solution.
policy/coding-style.md Updates style guidance (wrapping rules; mandates braces for conditionals/loops).
plans/FORMATTING.md Defines markdown formatting + lint workflow for plans/.
plans/.markdownlint.jsonc Adds markdownlint configuration for plans/.
plans/async-performance/README.md High-level index + key findings for async performance research.
plans/async-performance/issue-summary.md Categorized inventory of relevant async performance issues/discussions.
plans/async-performance/root-causes.md Narrative root-cause analysis of async performance problems.
plans/async-performance/recommendations.md Prioritized recommendations and sequencing guidance.
plans/async-performance/connection-pool-redesign.md Summary of ChannelDbConnectionPool design/progress.
plans/async-performance/executive-summary-feature-42261.md Executive summary / delivery plan for Unix thread starvation feature.
plans/async-performance/apps/ThreadStarvation/ThreadStarvation.csproj New repro console app project definition.
plans/async-performance/apps/ThreadStarvation/EntryPoint.cs CLI parsing/dispatch to the repro harness.
plans/async-performance/apps/ThreadStarvation/SqlClientEventListener.cs EventSource listener for SqlClient diagnostics output.
plans/async-performance/01-connection-pool/summary.md P1 summary for completing ChannelDbConnectionPool.
plans/async-performance/01-connection-pool/priority-rationale.md Rationale for ranking pool work as top priority.
plans/async-performance/01-connection-pool/01-factory-routing.md Proposed fix plan for factory routing to pool v2.
plans/async-performance/01-connection-pool/02-lifecycle.md Proposed fix plan for startup/shutdown lifecycle.
plans/async-performance/01-connection-pool/03-clear.md Proposed fix plan for implementing Clear().
plans/async-performance/01-connection-pool/04-warmup.md Proposed fix plan for warmup behavior.
plans/async-performance/01-connection-pool/05-pruning.md Proposed fix plan for pruning behavior.
plans/async-performance/01-connection-pool/06-transactions.md Proposed fix plan for transaction enlistment support.
plans/async-performance/01-connection-pool/07-error-handling.md Proposed fix plan for error handling/backoff.
plans/async-performance/01-connection-pool/08-rate-limiting.md Proposed fix plan for opt-in connection creation rate limiting.
plans/async-performance/01-connection-pool/09-metrics.md Proposed fix plan for EventSource/metrics integration.
plans/async-performance/01-connection-pool/10-integration-tests.md Proposed plan for integration test coverage of pool v2.
plans/async-performance/02-tds-async-reads/summary.md P2 summary for addressing snapshot/replay overhead.
plans/async-performance/02-tds-async-reads/priority-rationale.md Rationale for ranking TDS async reads as P2.
plans/async-performance/02-tds-async-reads/01-continue-points.md Proposed “continue points” expansion approach.
plans/async-performance/02-tds-async-reads/02-plp-streaming.md Proposed PLP resume/streaming approach to avoid replay.
plans/async-performance/02-tds-async-reads/03-snapshot-buffer-pool.md Proposed snapshot buffer pooling approach.
plans/async-performance/02-tds-async-reads/04-async-no-snapshot.md Proposed “async without snapshot” for sequential access.
plans/async-performance/02-tds-async-reads/05-state-machine-parser.md Long-term state machine / pipelines-based parser strategy.
plans/async-performance/03-async-transactions/summary.md P3 summary for true async transaction APIs.
plans/async-performance/03-async-transactions/priority-rationale.md Rationale for ranking async transactions as P3.
plans/async-performance/03-async-transactions/01-async-tds-transaction.md Proposed async transaction manager request in TdsParser.
plans/async-performance/03-async-transactions/02-async-commit-rollback.md Proposed overrides for CommitAsync/RollbackAsync.
plans/async-performance/03-async-transactions/03-async-begin-transaction.md Proposed BeginTransactionAsync override guidance.
plans/async-performance/03-async-transactions/04-deferred-begin.md Proposed deferred BEGIN TRANSACTION optimization.
plans/async-performance/04-async-sni-opens/summary.md P4 summary for end-to-end async connection opening.
plans/async-performance/04-async-sni-opens/priority-rationale.md Rationale for ranking async SNI opens as P4.
plans/async-performance/04-async-sni-opens/01-async-managed-connect.md Proposed async factory/connect for managed SNI TCP.
plans/async-performance/04-async-sni-opens/02-async-ssl.md Proposed async TLS handshake wiring.
plans/async-performance/04-async-sni-opens/03-async-prelogin.md Proposed async prelogin handshake wiring.
plans/async-performance/04-async-sni-opens/04-async-login.md Proposed async login plan.
plans/async-performance/04-async-sni-opens/05-async-open-pipeline.md Proposed end-to-end async open orchestration.
plans/async-performance/05-allocation-reduction/summary.md P5 summary for allocation reduction opportunities.
plans/async-performance/05-allocation-reduction/priority-rationale.md Rationale for ranking allocation reduction as P5.
plans/async-performance/05-allocation-reduction/01-tds-buffer-pool.md Proposed pooling for in/out TDS buffers.
plans/async-performance/05-allocation-reduction/02-snapshot-buffer-pool.md Cross-reference for pooling snapshot packet data.
plans/async-performance/05-allocation-reduction/03-cancellation-optimization.md Proposed CancellationToken registration optimizations.
plans/async-performance/05-allocation-reduction/04-setchars-buffer-pool.md Proposed pooling for TVP char buffers.
plans/async-performance/05-allocation-reduction/05-semaphore-optimization.md Proposed replacing ConcurrentQueueSemaphore allocation pattern.
plans/async-performance/06-packet-locking/summary.md P6 summary for managed SNI packet locking redesign.
plans/async-performance/06-packet-locking/priority-rationale.md Rationale for ranking packet locking as P6.
plans/async-performance/06-packet-locking/01-tcp-async-locks.md Proposed SemaphoreSlim locking for SniTcpHandle.
plans/async-performance/06-packet-locking/02-np-async-locks.md Proposed SemaphoreSlim locking for SniNpHandle.
plans/async-performance/06-packet-locking/03-mars-send-flow.md Proposed MARS send flow control modernization.
plans/async-performance/06-packet-locking/04-mars-receive-lock.md Proposed MARS receive queue lock replacement.
plans/async-performance/07-async-tvp/summary.md P7 summary for async TVP data sources.
plans/async-performance/07-async-tvp/priority-rationale.md Rationale for ranking async TVPs as P7.
plans/async-performance/07-async-tvp/01-async-enumerable-tvp.md Proposed IAsyncEnumerable-based TVP acceptance.
plans/async-performance/07-async-tvp/02-async-row-writing.md Proposed async TVP row writing pipeline changes.
plans/async-performance/07-async-tvp/03-async-bulkcopy.md Proposed async BulkCopy enumeration support.
Directory.Packages.props Adjusts central package versions (impacts restore/build behavior).
.github/prompts/ado-work-item-clarification.prompt.md Updates ADO prompt guidance for forcing Markdown format updates.
.github/instructions/plans-formatting.instructions.md Adds repo instructions for plans/**/*.md formatting/linting.
.github/instructions/coding-style.instructions.md Adds repo instruction to follow policy/coding-style.md for C#.

@@ -0,0 +1,18 @@
<Project Sdk="Microsoft.NET.Sdk">
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .csproj appears to be saved with a UTF-8 BOM (the invisible character before <Project> on line 1). policy/coding-style.md specifies text files should be UTF-8 encoded without BOM; please resave without BOM to align with repo standards and avoid tooling diffs.

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +16
<PackageReference Include="Microsoft.Data.SqlClient" />
<PackageReference Include="Microsoft.Data.SqlClient.Extensions.Azure" />
<PackageReference Include="System.CommandLine" Version="2.0.3" VersionOverride="2.0.3" />
</ItemGroup>
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PackageReference for System.CommandLine specifies both Version and VersionOverride. With central package management enabled, this is redundant and may cause restore warnings/errors depending on NuGet/MSBuild settings. Prefer a single mechanism: either define System.CommandLine via PackageVersion (possibly in a local Directory.Packages.props for this app) and use VersionOverride only when needed, or remove VersionOverride and keep Version.

Copilot uses AI. Check for mistakes.
Comment on lines +24 to +25
3. **Large data reads are catastrophically slow in async** — the TDS snapshot/replay mechanism
causes exponential time growth with data size.
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key finding #3 says the snapshot/replay mechanism "causes exponential time growth with data size", but the root-cause analysis in this same plan describes it as O(n²) (quadratic) in packet count. Exponential is materially different/misleading; please correct the wording to "quadratic" (or "super-linear / quadratic") to keep the summary technically accurate.

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +22
<PackageVersion Include="Microsoft.Data.SqlClient" Version="7.0.0" />
<PackageVersion Include="Microsoft.Data.SqlClient.Extensions.Azure" Version="1.0.0" />
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PackageVersion entries for Microsoft.Data.SqlClient and Microsoft.Data.SqlClient.Extensions.Azure were added unconditionally. This creates two concrete restore/build hazards:

  1. When building with -p:ReferenceType=Package, the conditional ItemGroup below also defines versions for these same package IDs, resulting in duplicate PackageVersion items.
  2. doc/Directory.Packages.props imports the root props and also defines PackageVersion for these package IDs, which will now be duplicates as well.

To avoid breaking package-mode builds and doc builds, consider removing these unconditioned entries and instead scoping the versions to the plans/ app (e.g., add a Directory.Packages.props under plans/async-performance/apps/ThreadStarvation/ similar to doc/apps/AzureAuthentication/Directory.Packages.props, or use PackageVersion Update=... in the importing props where overriding is intended).

Suggested change
<PackageVersion Include="Microsoft.Data.SqlClient" Version="7.0.0" />
<PackageVersion Include="Microsoft.Data.SqlClient.Extensions.Azure" Version="1.0.0" />

Copilot uses AI. Check for mistakes.
version specified.
-->
<PackageVersion Include="Microsoft.Data.SqlClient" Version="7.0.0" />
<PackageVersion Include="Microsoft.Data.SqlClient.Extensions.Azure" Version="1.0.0" />
Copy link
Copy Markdown
Member

@cheenamalhotra cheenamalhotra Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why reference MDS v7.0.0 and Azure extension 1.0.0 in project reference?

Comment on lines +51 to +53
4. **The key insight:** PLP data is append-only. We never need to "re-interpret" already-read bytes
— we just need to copy them from packets to the output buffer. On resume, we can skip the copy
for already-processed data.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also noticed that the PLP chunks are usually sized to fit into a single packet. If the reassembled TDS packet contains nothing other than a complete PLP chunk, then we can steal that packet buffer, slice it up to form a ReadOnlyMemory and append that ReadOnlyMemory instance to the list of PLP chunks. This saves a buffer copy (although we would still need to allocate a byte array when we replace the TDS packet buffer.)

### Phase 4: Remove Snapshot Mechanism

- Once all paths are async-native, remove `StateSnapshot` entirely
- Remove `SnapshotStatus`, `_snapshot`, `_cachedSnapshot`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the AAsyncCallContext-derived classes?


### Phase 3: Convert Token Processing to Async

- `TryProcessToken_*` methods become async
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not strictly necessary for this phase, but I'd really like to see these methods turned into proper .NET types which read their data from (or write their data to) a PipeReader/PipeWriter. It's a very large part of TdsParser.cs, and the restructuring would make testing these changes a lot easier.


### Phase 2: Convert Leaf Methods to Async

- Start with `TryReadByte`, `TryReadInt16`, etc.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the write path, it may also be worth refactoring TdsValueSetter beforehand - there's some duplicate write logic there.


- `TryRun()` calls dozens of `TryRead*` methods — all must become async
- Sync callers need a sync wrapper (or separate sync path)
- Deep async call chains have overhead (state machine allocation per frame)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In (at least) netcore, if the ValueTask completes synchronously then there's no heap allocation. There may be less overhead here than we expect.

### 3. Cold-Start Amplification

Issue #601 reports that opening 100 connections in parallel takes minutes because connections are
created serially. P1 enables parallel creation, but if each creation still blocks a thread, the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing P1 could actually result in worse application startup performance until P4 is fixed because we'd be blocking large parts of the entire application's thread pool while the connection pool populates. This is particularly likely to be a problem in cases where an application sets the minimum pool size to more than the number of threads (which is quite likely - people will expect an async SqlClient to be network-bound, so a large minimum pool size would make intuitive sense.)

Perhaps it'd be safer to temporarily limit the degree of parallelism on P1 until P4 is complete? Alternatively, steps 1 & 2 in the sequencing consideration could be reversed.


## Problem

`ConcurrentQueueSemaphore` (used in managed SNI for stream-level read/write locking) allocates a
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: when wrapping an SniNetworkStream with an SniSslStream, we only need to lock on the outer ConcurrentQueueSemaphore.

- `SqlSequentialTextReader.cs` lines 363-421 — text streaming
- `TdsParser.cs` lines 1426, 3560, 3573, 12962-13082 — packet and value buffers

### Not Pooled — Allocation Hotspots
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately, there's a remaining piece of work to remove the boxing of SqlGuid values.


- Opt-in: users must explicitly provide `IAsyncEnumerable<SqlDataRecord>`
- Existing sync paths unchanged
- Requires `#if NET` guard — `IAsyncEnumerable` not available on .NET Framework
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IAsyncEnumerable is available in Microsoft.Bcl.AsyncInterfaces, would this not work?

- The property implements custom caching logic (not the standard `AcquireAndReturn` helper) and
also checks `OperatingSystem.IsWindows()` at runtime. Setting the switch on a non-Windows OS
has no effect.
- When `ILLink.Substitutions.xml` is used for trimming, the switch value may be baked in at
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced this - the plan will eventually be to have a custom .targets file which allows someone to specify set an MSBuild property in their project file and drop the dependency on the native SNI DLL entirely (or to enable single-file applications to work.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

4 participants