-
-
Notifications
You must be signed in to change notification settings - Fork 98
Cloudflare Workers: sendActivity() fanout path does not await fanoutQueue.enqueue(), risking dropped fanout messages without waitUntil #661
Description
Summary
When running Fedify on Cloudflare Workers (module worker) with a Queue-backed MessageQueue (e.g. @fedify/cfworkers), the fanout branch of ContextImpl.sendActivityInternal() calls fanoutQueue.enqueue(...) without await-ing or returning the Promise.
On Cloudflare Workers, a Promise that is not awaited / returned / passed to ctx.waitUntil() is a floating promise, and Cloudflare explicitly warns that floating promises can lead to dropped results / unfinished work, because the runtime may terminate the isolate before the promise completes. This can make the fanout task enqueue itself unreliable.
In practice, this may appear as intermittent federation delivery gaps: the user receives a successful HTTP response, but the fanout message never reaches the queue, so no downstream delivery happens.
(Korean note / 한국어 요약: Cloudflare Workers에서는 await되지 않은 Promise가 유실될 수 있고, Fedify fanout 분기에서 fanoutQueue.enqueue()가 await되지 않아 fanout 메시지가 큐에 안 들어갈 가능성이 있습니다.)
Affected versions
@fedify/fedify: 2.1.3@fedify/cfworkers: 2.1.3- Runtime: Cloudflare Workers (ES module worker)
(Commit / tag details are unspecified from the reporter side.)
Code pointers (Fedify 2.1.3 published source)
File: src/federation/middleware.ts (JSR source for 2.1.3)
FANOUT_THRESHOLD = 5(around L1654)- Fanout decision logic checks inbox map size
< FANOUT_THRESHOLD(around L2317–L2323) - In fanout branch:
_startQueueInternal(...)is invoked (not awaited), thenfanoutQueue.enqueue(...)is called (not awaited) (around L2353–L2359)
Reference:
https://jsr.io/@fedify/fedify/2.1.3/src/federation/middleware.ts
Why this is problematic specifically on Cloudflare Workers
Cloudflare Workers best practices:
- “A Promise that is not awaited / returned / passed to ctx.waitUntil() is a floating promise”
- “Floating promises cause silent bugs: dropped results, swallowed errors, unfinished work”
- “The Workers runtime may terminate your isolate before a floating promise completes.”
https://developers.cloudflare.com/workers/best-practices/workers-best-practices/
Cloudflare ctx.waitUntil() docs:
ctx.waitUntil()extends the Worker’s lifetime for background work (non-blocking).- There is a “30-second after invocation end” limit (shared across waitUntil calls).
https://developers.cloudflare.com/workers/runtime-apis/context/
Why “wrapping sendActivity with ctx.waitUntil” is NOT sufficient today
If ctx.sendActivity(...) resolves before the internal fanoutQueue.enqueue(...) promise resolves, then:
ctx.waitUntil(ctx.sendActivity(...));only waits for the sendActivity Promise, which may already be resolved (because enqueue is not part of the promise chain). So waitUntil(sendActivityPromise) does not guarantee the enqueue completes.
Reproduction suggestions (minimal)
Layer 1: Cloudflare Workers (baseline floating promise repro)
Create a Worker with a Queue producer binding and a consumer Worker (or a single Worker with both fetch and queue handlers).
In fetch, call env.MY_QUEUE.send({ ... }) without await and immediately return new Response("OK").
Send many requests; compare number of requests vs messages consumed/logged by the queue handler.
Then fix by either await env.MY_QUEUE.send(...) or ctx.waitUntil(env.MY_QUEUE.send(...)); compare again.
Layer 2: Fedify fanout repro
Ensure fanout triggers (inbox map size >= 5).
Call ctx.sendActivity(...) from a request handler that returns a Response immediately.
Observe intermittent missing fanout tasks / missing downstream deliveries.
Proposed fixes (backward-compatible options)
Option A: include enqueue in the Promise chain
In the fanout branch of sendActivityInternal, either:
await this.federation.fanoutQueue.enqueue(message, { orderingKey: options.orderingKey }); OR
return await this.federation.fanoutQueue.enqueue(...);
This ensures:
await ctx.sendActivity(...) guarantees the fanout message is actually enqueued
Cloudflare users can safely do ctx.waitUntil(ctx.sendActivity(...)) to avoid response latency while still guaranteeing enqueue completion.
Option B: add an explicit option
Introduce an option (e.g. fanoutEnqueue: "await" | "detach") with docs explaining Cloudflare Workers expectations. Default is TBD.
Option C: Cloudflare-specific wrapper guidance (workaround)
Document a wrapper queue that registers every enqueue promise via ExecutionContext.waitUntil() (and optionally no-ops listen() for WorkersMessageQueue). This is a workaround; Option A is still the cleanest semantics for sendActivity().