Fix async release before HTLC decode #4548

Open
valentinewallace wants to merge 1 commit into lightningdevkit:main from valentinewallace:2026-04-async-held-htlc-ordering-v2

@valentinewallace
Contributor

Handle `ReleaseHeldHtlc` messages that arrive before the sender-side LSP has even queued the held HTLC for onion decoding. Unlike #4106, which covers releases arriving after the HTLC is in `decode_update_add_htlcs` but before it reaches `pending_intercepted_htlcs`, this preserves releases that arrive one step earlier and would otherwise be dropped as "HTLC not found".

Supersedes #4542

@valentinewallace valentinewallace added this to the 0.3 milestone Apr 8, 2026
@valentinewallace valentinewallace self-assigned this Apr 8, 2026
@valentinewallace valentinewallace added the weekly goal Someone wants to land this this week label Apr 8, 2026
@ldk-reviews-bot

ldk-reviews-bot commented Apr 8, 2026

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-claude-review-bot
Collaborator

ldk-claude-review-bot commented Apr 8, 2026

After thoroughly reviewing every file and hunk in this PR, including tracing all race condition windows, lock ordering, and state machine transitions:

Review Summary

No new issues found beyond those already flagged in prior reviews.

Prior comment status update

  • lightning/src/ln/channel.rs:8157 (Resolved): the comment now correctly references `Self::should_hold_inbound_htlc`.
  • lightning/src/ln/channelmanager.rs:7504 (False positive in prior review): there is only one "arrived" in the text. No double word.
  • lightning/src/ln/channel.rs:8171 (Still applicable): "HTlC" should be "HTLC" (lowercase 'l' in doc comment).
  • lightning/src/ln/channelmanager.rs:7534 (Resolved): when `should_hold_inbound_htlc` returns false (due to either Released or Signaled clearing the channel flag), the code correctly falls through to `else if intercept_forward` at line 7534, preserving user-level interception behavior.
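
The fall-through described in the last item can be sketched as a small decision function. This is only an illustration under simplifying assumptions: `ForwardAction` and `decide` are hypothetical names that do not appear in the PR, and the three booleans stand in for the manager's copy of the hold flag, the result of re-checking the channel's copy, and whether user-level interception applies.

```rust
/// What the manager does with a decoded forward (hypothetical, simplified).
#[derive(Debug, PartialEq)]
enum ForwardAction {
    /// Keep holding the HTLC until a release arrives.
    Hold,
    /// Hand the HTLC to user-level interception.
    Intercept,
    /// Forward normally.
    Forward,
}

/// Hypothetical stand-in for the branch discussed in the review.
///
/// * `routing_wants_hold`: the manager's copy of the HTLC still carries the hold flag.
/// * `channel_still_holds`: the channel's copy still has the flag set when re-checked
///   (models the result of `should_hold_inbound_htlc`).
/// * `intercept_forward`: user-level interception applies to this forward.
fn decide(
    routing_wants_hold: bool,
    channel_still_holds: bool,
    intercept_forward: bool,
) -> ForwardAction {
    if routing_wants_hold && channel_still_holds {
        ForwardAction::Hold
    } else if intercept_forward {
        // A release that cleared the channel's flag must not skip interception.
        ForwardAction::Intercept
    } else {
        ForwardAction::Forward
    }
}
```

In this model a release that already cleared the channel's flag never yields `Hold`, but it still reaches the interception branch when interception is enabled.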

Architecture assessment

The three-stage release in release_pending_inbound_held_htlc correctly covers the race window:

  1. `monitor_pending_update_adds` (clone exists after Committed promotion, pre-drain): clearing here affects the clone that enters `decode_update_add_htlcs` → Released.
  2. Pre-Committed states (RemoteAnnounced/AwaitingRemoteRevokeToAnnounce/AwaitingAnnouncedRemoteRevoke): HTLC hasn't been cloned. Clearing affects the clone taken at promotion → Released.
  3. Committed/WithOnion (post-clone): clone already taken. Channel's copy cleared as signal for `should_hold_inbound_htlc` → Signaled.

The should_hold_inbound_htlc callback at line 7508 correctly detects the Signaled case by re-checking the channel's copy under peer_state lock. Lock ordering (peer_state → decode_update_add_htlcs → pending_intercepted_htlcs) is consistent between handle_release_held_htlc and process_pending_update_add_htlcs. The test accurately reproduces the pre-promotion race.
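
The three stages above can be modeled with a toy channel. This is a rough sketch, not the PR's code: the struct layout, the two-state `HtlcState`, the `ReleaseResult` enum, and the method `release_held_htlc` are simplified stand-ins invented for illustration, and clearing the channel's copy in stage 1 is a modeling assumption based on the review's remark that both Released and Signaled clear the channel flag.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum HtlcState {
    /// Stands in for all pre-Committed states.
    RemoteAnnounced,
    Committed,
}

#[derive(Debug, Clone)]
struct InboundHtlc {
    id: u64,
    state: HtlcState,
    hold_htlc: bool,
}

/// Simplified analogue of the release result described in the review.
#[derive(Debug, PartialEq)]
enum ReleaseResult {
    Released,
    Signaled,
    NotFound,
}

#[derive(Default)]
struct Channel {
    /// Clones taken at promotion that the manager has not drained yet.
    monitor_pending_update_adds: Vec<InboundHtlc>,
    /// The channel's own copies of inbound HTLCs.
    pending_inbound_htlcs: Vec<InboundHtlc>,
}

impl Channel {
    fn release_held_htlc(&mut self, id: u64) -> ReleaseResult {
        // Stage 1: a clone is still queued in the channel, so clearing it is enough.
        let clone = self.monitor_pending_update_adds.iter_mut().find(|h| h.id == id);
        let clone_queued = if let Some(h) = clone {
            h.hold_htlc = false;
            true
        } else {
            false
        };
        let Some(h) = self.pending_inbound_htlcs.iter_mut().find(|h| h.id == id) else {
            return if clone_queued { ReleaseResult::Released } else { ReleaseResult::NotFound };
        };
        h.hold_htlc = false;
        if clone_queued || h.state != HtlcState::Committed {
            // Stage 2: no clone taken yet; the clone made at promotion sees the cleared flag.
            ReleaseResult::Released
        } else {
            // Stage 3: the clone already left the channel; the manager must re-check.
            ReleaseResult::Signaled
        }
    }

    /// Re-check used by the manager before it decides to hold its own copy.
    fn should_hold_inbound_htlc(&self, id: u64) -> bool {
        self.pending_inbound_htlcs.iter().any(|h| h.id == id && h.hold_htlc)
    }
}
```

In this model, a release for an HTLC in the pre-Committed state returns `Released`, while one for a Committed HTLC whose clone was already drained returns `Signaled` and relies on `should_hold_inbound_htlc` returning false afterwards.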

Contributor

@tnull tnull left a comment

Mostly looks good, CI is just failing due to some broken doc links AFAICT.

Also validated this still works in conjunction with #4463 / for lightningdevkit/ldk-node#817.

state: InboundHTLCState,
}

/// Result of [`FundedChannel::release_pending_inbound_held_htlc`].
Contributor

Might be worth adding some more rationale why this is even necessary? I.e. describing the race this fixes?

Contributor Author

Updated most of the docs, lmk your thoughts.

LocalHTLCFailureReason::TemporaryNodeFailure
);
},
// Check whether a ReleaseHeldHtlc arrived while the HTLC was in transit from channel
Contributor

Also might be worth adding some more context here.

@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch 2 times, most recently from ae776b9 to bd632fb Compare April 9, 2026 18:10
@valentinewallace
Contributor Author

Addressed feedback with diff (plus an extra push for a whitespace fix).

tnull previously approved these changes Apr 9, 2026
Contributor

@tnull tnull left a comment

LGTM, just some doc nits.

Feel free to land as is though, assuming CI passes. (mod Claude's comments, maybe)

pub(super) enum HeldHtlcReleaseResult {
/// `hold_htlc` was cleared. The release of the HTLC is fully handled.
Released,
/// The ChannelManager has already pulled a copy of this HTLC out of the Channel for processing,
Contributor

nit:

Suggested change
/// The ChannelManager has already pulled a copy of this HTLC out of the Channel for processing,
/// The `ChannelManager` has already pulled a copy of this HTLC out of the `Channel` for processing,

Released,
/// The ChannelManager has already pulled a copy of this HTLC out of the Channel for processing,
/// but may not yet be persisting the HTLC in its internal state. This variant indicates that
/// we've cleared the `hold_htlc` flag in the Channel's cached version of the HTLC, which the
Contributor

Suggested change
/// we've cleared the `hold_htlc` flag in the Channel's cached version of the HTLC, which the
/// we've cleared the `hold_htlc` flag in the `Channel`'s cached version of the HTLC, which the

/// The ChannelManager has already pulled a copy of this HTLC out of the Channel for processing,
/// but may not yet be persisting the HTLC in its internal state. This variant indicates that
/// we've cleared the `hold_htlc` flag in the Channel's cached version of the HTLC, which the
/// manager will re-check when it goes to actually forward in process_pending_htlc_forwards.
Contributor

Suggested change
/// manager will re-check when it goes to actually forward in process_pending_htlc_forwards.
/// manager will re-check when it goes to actually forward in `process_pending_htlc_forwards`.

HeldHtlcReleaseResult::NotFound
}

/// Returns whether an inbound HTlC has the `hold_htlc` flag set. Useful if the Channel's copy of
Contributor

Suggested change
/// Returns whether an inbound HTlC has the `hold_htlc` flag set. Useful if the Channel's copy of
/// Returns whether an inbound HTLC has the `hold_htlc` flag set. Useful if the `Channel`'s copy of

}

/// Returns whether an inbound HTlC has the `hold_htlc` flag set. Useful if the Channel's copy of
/// an HTLC was updated to be released but the ChannelManager's copy of the HTLC wasn't.
Contributor

Suggested change
/// an HTLC was updated to be released but the ChannelManager's copy of the HTLC wasn't.
/// an HTLC was updated to be released but the `ChannelManager`'s copy of the HTLC wasn't.

Handle `ReleaseHeldHtlc` messages that arrive before the sender-side LSP
has even queued the held HTLC for onion decoding. Unlike lightningdevkit#4106, which
covers releases arriving after the HTLC is in `decode_update_add_htlcs`
but before it reaches `pending_intercepted_htlcs`, this preserves
releases that arrive one step earlier and would otherwise be dropped as
HTLC not found.

Co-Authored-By: HAL 9000
Co-Authored-By: Elias Rohrer <dev@tnull.de>
@valentinewallace valentinewallace force-pushed the 2026-04-async-held-htlc-ordering-v2 branch from bd632fb to f764872 Compare April 9, 2026 18:15
@tnull tnull requested a review from TheBlueMatt April 9, 2026 18:18
HeldHtlcReleaseResult::NotFound
}

/// Returns whether an inbound HTlC has the `hold_htlc` flag set. Useful if the Channel's copy of
Collaborator

Nit: "HTlC" → "HTLC" (lowercase 'l').

Suggested change
/// Returns whether an inbound HTlC has the `hold_htlc` flag set. Useful if the Channel's copy of
/// Returns whether an inbound HTLC has the `hold_htlc` flag set. Useful if the Channel's copy of

@codecov

codecov bot commented Apr 9, 2026

Codecov Report

❌ Patch coverage is 76.19048% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.00%. Comparing base (5704e8e) to head (f764872).
⚠️ Report is 12 commits behind head on main.

Files with missing lines              Patch %   Lines
lightning/src/ln/channel.rs           60.00%    13 Missing and 1 partial ⚠️
lightning/src/ln/channelmanager.rs    96.42%    1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4548   +/-   ##
=======================================
  Coverage   86.99%   87.00%           
=======================================
  Files         163      163           
  Lines      108635   108720   +85     
  Branches   108635   108720   +85     
=======================================
+ Hits        94511    94587   +76     
- Misses      11647    11650    +3     
- Partials     2477     2483    +6     
Flag      Coverage Δ
fuzzing   40.17% <4.76%> (-0.14%) ⬇️
tests     86.09% <76.19%> (+<0.01%) ⬆️

  );
- if pending_add.forward_info.routing.should_hold_htlc() {
+ let hold_htlc = if pending_add.forward_info.routing.should_hold_htlc() {
+ // Check whether a ReleaseHeldHtlc arrived after the manager pulled a copy of this
Collaborator

This feels like it should be fixed by taking an additional lock in handle_release_held_htlc rather than checking for it here?

Contributor Author

@valentinewallace valentinewallace Apr 9, 2026

I don't believe so -- did you see this race condition? #4542 (comment) Maybe we don't support that kind of multithreaded usage anyway?

Contributor Author

Btw, if we try to acquire the decode_update_add_htlcs lock prior to taking the channel lock, which I think would fix a different race you may be pointing out here, it currently results in a lock order violation.
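
To illustrate why acquisition order matters here, the following is a minimal sketch of two code paths that take the same three locks in the same order (peer state, then `decode_update_add_htlcs`, then `pending_intercepted_htlcs`), which is the order the bot review describes. Everything else is an assumption for illustration: `Manager`, `release`, `advance`, and `remove` are hypothetical, the queues hold bare ids, and the real code does not necessarily hold all three locks at once.

```rust
use std::sync::Mutex;

/// Toy manager: each queue holds ids of HTLCs at one stage of processing.
#[derive(Default)]
struct Manager {
    peer_state: Mutex<Vec<u64>>,                // still inside the channel
    decode_update_add_htlcs: Mutex<Vec<u64>>,   // queued for onion decoding
    pending_intercepted_htlcs: Mutex<Vec<u64>>, // held by the manager
}

/// Removes `id` from `v`, returning whether it was present.
fn remove(v: &mut Vec<u64>, id: u64) -> bool {
    let before = v.len();
    v.retain(|x| *x != id);
    v.len() != before
}

impl Manager {
    /// Release path: looks for the HTLC at every stage while holding all locks.
    fn release(&self, id: u64) -> &'static str {
        // Same order as `advance`, so the two paths cannot deadlock each other.
        let mut in_channel = self.peer_state.lock().unwrap();
        let mut decoding = self.decode_update_add_htlcs.lock().unwrap();
        let mut intercepted = self.pending_intercepted_htlcs.lock().unwrap();
        if remove(&mut intercepted, id) {
            "intercepted"
        } else if remove(&mut decoding, id) {
            "decoding"
        } else if remove(&mut in_channel, id) {
            "channel"
        } else {
            "not found"
        }
    }

    /// Processing path: moves every HTLC one stage forward.
    fn advance(&self) {
        let mut in_channel = self.peer_state.lock().unwrap();
        let mut decoding = self.decode_update_add_htlcs.lock().unwrap();
        let mut intercepted = self.pending_intercepted_htlcs.lock().unwrap();
        intercepted.append(&mut decoding);
        decoding.append(&mut in_channel);
    }
}
```

If one path instead took `decode_update_add_htlcs` before the peer state lock while the other kept the order above, two threads could each hold one lock and wait on the other, which is the kind of ordering violation mentioned in the comment.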

5 participants