feat(telemetry): Trace finding lifecycle snapshots#249
Open
Record flattened finding payloads across the hunk, report, and review stages so traces preserve the full finding lifecycle instead of only aggregate counts. Refactor skill finalization into a shared path so CLI, action, schedule, and eval execution emit the same report-stage telemetry, and document the Warden-specific span conventions in the telemetry spec. Co-Authored-By: GPT-5 <noreply@anthropic.com>
Remove the Sentry helper test and the telemetry-only assertions that mocked Sentry internals. Also drop the ad hoc skill.name and trigger.name attrs from the review-post span so the custom telemetry stays within the documented Warden namespaced fields. Co-Authored-By: GPT-5 <noreply@anthropic.com>
Extract the review posting stages into small helpers so the main path reads as a linear pipeline. Keep the review finding telemetry at the same lifecycle boundaries while preserving the existing early-return behavior around render results and success-only reviews. Co-Authored-By: GPT-5 <noreply@anthropic.com>
Keep review_posted telemetry aligned with the actual GitHub review outcome. Emit the posted finding snapshot only after the review call succeeds so failed post attempts do not appear as delivered in traces. Co-Authored-By: GPT-5 <noreply@anthropic.com>
Move finding-stage attribute shaping into a dedicated helper and keep Sentry focused on span lifecycle and transport concerns. Trim snapshot payloads to core metadata so telemetry stops duplicating finding text and suggested fix content that already exists elsewhere in the trace and review output. Co-Authored-By: Codex GPT-5 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Treat re-rendered reviews as not posted when rendering produces no review payload. This keeps review_posted telemetry and synthetic comment tracking aligned with what was actually sent to GitHub. Add a regression test for the no-op render path. Co-Authored-By: Codex GPT-5 <noreply@anthropic.com>
Trace finding lifecycle snapshots in telemetry
Warden only emitted aggregate finding counts, which meant traces lost the actual findings and any findings pruned during deduplication or review filtering. This adds flattened indexed finding attributes across the hunk, report, and review stages so the trace shows the finding lifecycle instead of only the survivors.
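As a rough illustration of what "flattened indexed finding attributes" could look like, here is a minimal sketch. The `FindingSnapshot` fields, the `flattenFindings` helper, and the exact attribute keys (`warden.findings.<index>.<field>`) are assumptions made for this example; the actual names are defined by the telemetry spec and the code in this PR.

```typescript
// Hypothetical snapshot shape: core metadata only, no finding text or suggested fix.
interface FindingSnapshot {
  id: string;
  severity: string;
  path: string;
  line?: number;
}

// Span attributes are flat key/value pairs, so nested arrays must be encoded by index.
type SpanAttributes = Record<string, string | number>;

// Assumed key layout: warden.findings.<index>.<field>, plus a stage name and a count.
function flattenFindings(
  stage: string,
  findings: FindingSnapshot[],
): SpanAttributes {
  const attrs: SpanAttributes = {
    "warden.findings.stage": stage,
    "warden.findings.count": findings.length,
  };
  findings.forEach((finding, index) => {
    const prefix = `warden.findings.${index}`;
    attrs[`${prefix}.id`] = finding.id;
    attrs[`${prefix}.severity`] = finding.severity;
    attrs[`${prefix}.path`] = finding.path;
    // Optional fields are omitted rather than written as empty values.
    if (finding.line !== undefined) {
      attrs[`${prefix}.line`] = finding.line;
    }
  });
  return attrs;
}
```

Emitting one such attribute set per stage (hunk, report, review) lets a trace consumer compare stages and see which findings were dropped along the way, not only how many remain.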
The skill report finalization path is now shared between runSkill() and runSkillTask(), which keeps CLI, action, schedule, and eval execution aligned. The action review poster now records the filtered, consolidated, deduplicated, and actually posted finding sets on a trigger-local review span.
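The review-stage recording described above can be sketched roughly as follows. Everything here is hypothetical: the stage names, the `Recorder` callback, and `postReviewWithSnapshots` are illustrative stand-ins, consolidation is a placeholder, and the flow is synchronous for brevity, whereas a real GitHub call would be asynchronous. The point is only the ordering: the posted snapshot is recorded after the review call succeeds, and never when nothing was sent.

```typescript
// Hypothetical stage names mirroring the sets named in the description.
type Stage = "filtered" | "consolidated" | "deduplicated" | "posted";

interface Finding {
  id: string;
  severity: string;
}

// Stand-in for writing a finding snapshot onto the review span.
type Recorder = (stage: Stage, findings: Finding[]) => void;

function postReviewWithSnapshots(
  filtered: Finding[],
  postReview: (findings: Finding[]) => void, // assumed to throw on failure
  record: Recorder,
): boolean {
  record("filtered", filtered);

  // Placeholder: real consolidation would merge related findings.
  const consolidated = filtered;
  record("consolidated", consolidated);

  // Drop findings whose id was already seen.
  const seen = new Set<string>();
  const deduplicated = consolidated.filter((finding) => {
    if (seen.has(finding.id)) return false;
    seen.add(finding.id);
    return true;
  });
  record("deduplicated", deduplicated);

  // Nothing to send: treat as not posted and record no posted snapshot.
  if (deduplicated.length === 0) return false;

  try {
    postReview(deduplicated);
  } catch (_err) {
    // A failed post must not appear as delivered in the trace.
    // This sketch swallows the error; real code would likely surface it.
    return false;
  }

  record("posted", deduplicated);
  return true;
}
```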
This also documents the Warden-specific warden.findings spans, stage names, and flat indexed attribute encoding in the telemetry spec so the custom behavior is explicit instead of living only in code.