Skip to content

Implement lean_attestation_committee_subnet metric#243

Merged
pablodeymo merged 4 commits intomainfrom
add-attestation-committee-subnet-metric
Mar 18, 2026
Merged

Implement lean_attestation_committee_subnet metric#243
pablodeymo merged 4 commits intomainfrom
add-attestation-committee-subnet-metric

Conversation

@pablodeymo
Copy link
Collaborator

Motivation

The leanMetrics specification defines lean_attestation_committee_subnet as a Network Metric (Gauge, set on node start) that all lean clients should expose. ethlambda already computes the subnet ID at startup (validator_id % attestation_committee_count) for gossipsub topic subscription, but does not expose it as a Prometheus gauge.

This brings ethlambda closer to full leanMetrics compliance and helps operators verify subnet assignments via Grafana dashboards.

Description

Register a new lean_attestation_committee_subnet Prometheus IntGauge in the P2P metrics module and set it at startup with the computed subnet ID.

Changes

  • crates/net/p2p/src/metrics.rs: New set_attestation_committee_subnet(subnet_id: u64) function with a function-scoped LazyLock<IntGauge>, following the same pattern used by set_attestation_committee_count in the blockchain metrics module.

  • crates/net/p2p/src/lib.rs: Call the setter after computing subnet_id (line 213), passing subnet_id.unwrap_or(0). Non-validator nodes default to subnet 0, which matches their fanout publishing behavior.

  • docs/metrics.md: Added row to the Network Metrics table documenting the new metric as supported.

Design decisions

  • Placed in P2P metrics (not blockchain metrics): The subnet is a network-layer concept — it governs gossipsub topic subscription and is computed within the P2P module.
  • Non-validator nodes expose subnet 0: The leanMetrics spec says "on node start" with no validator-only qualifier. Non-validators use subnet 0 for fanout publishing, so exposing 0 is accurate.

How to Test

make fmt   # passes
make lint  # passes
make test  # all tests pass

To verify at runtime: start a node and check curl http://localhost:5054/metrics | grep lean_attestation_committee_subnet.

Register and set the metric at P2P startup with the computed subnet ID
(validator_id % attestation_committee_count), per the leanMetrics spec.
Non-validator nodes default to subnet 0.
@github-actions
Copy link

🤖 Kimi Code Review

Review Summary

The PR adds a metric to expose the node's attestation committee subnet ID. Overall, the implementation is correct but has a few minor issues.

Issues Found

  1. Potential panic on subnet ID conversion (crates/net/p2p/src/metrics.rs:79)

    • The code uses unwrap() on subnet_id.try_into() which could panic if subnet_id exceeds i64::MAX (9,223,372,036,854,775,807)
    • While subnet IDs are unlikely to reach this value in practice, consensus code should be defensive
    • Fix: Replace with proper error handling:
    LEAN_ATTESTATION_COMMITTEE_SUBNET.set(
        subnet_id.try_into()
            .unwrap_or_else(|_| i64::MAX) // or handle appropriately
    );
  2. Metric initialization location (crates/net/p2p/src/metrics.rs:72-78)

    • The LEAN_ATTESTATION_COMMITTEE_SUBNET gauge is re-initialized on every call to set_attestation_committee_subnet
    • While LazyLock ensures it's only registered once, this pattern is less efficient than initializing all metrics at module level
    • Fix: Move the gauge declaration to module level alongside other metrics
  3. Documentation inconsistency (docs/metrics.md:77)

    • The documentation states the metric is collected "On node start" but it's actually set during swarm building, which may happen multiple times
    • Fix: Change to "On swarm initialization" or similar

Minor Observations

  • The metric name lean_attestation_committee_subnet is clear and follows the established naming convention
  • The use of subnet_id.unwrap_or(0) for non-validators is consistent with the existing topic selection logic
  • The implementation correctly handles both validator and non-validator cases

Automated review by Kimi (Moonshot AI) · custom prompt

@github-actions
Copy link

🤖 Codex Code Review

  1. Potential startup panic from unchecked integer conversion
    At crates/net/p2p/src/metrics.rs:80, subnet_id.try_into().unwrap() can panic if subnet_id > i64::MAX (Prometheus IntGauge::set takes i64).
    Suggested change: use checked conversion and clamp or return early/log on overflow, e.g. if let Ok(v) = i64::try_from(subnet_id) { set(v) } else { ... }.

  2. Metric semantics are ambiguous for non-validators
    At crates/net/p2p/src/lib.rs:213, non-validator nodes publish lean_attestation_committee_subnet=0, which is indistinguishable from real subnet 0 validators.
    Suggested change: represent “not a validator” explicitly (for example -1 in IntGauge, or a separate lean_is_validator gauge) to avoid misleading monitoring/alerting.

No consensus-critical logic (fork choice, attestation validation, justification/finalization, state transition, XMSS, SSZ) is modified in this PR. Performance impact is negligible.


Automated review by OpenAI Codex · custom prompt

@github-actions
Copy link

🤖 Claude Code Review

Now I have everything I need for the review.


PR Review: Implement lean_attestation_committee_subnet metric

Overall: Small, focused change that correctly follows existing patterns. One notable inconsistency with the sibling function that warrants a fix.


crates/net/p2p/src/metrics.rsset_attestation_committee_subnet

Panic on overflow (minor bug)

LEAN_ATTESTATION_COMMITTEE_SUBNET.set(subnet_id.try_into().unwrap());

IntGauge::set takes i64, and try_into::<i64>() on a u64 will fail (panic) if subnet_id > i64::MAX. Compare this to the sibling function in blockchain/src/metrics.rs:371:

LEAN_ATTESTATION_COMMITTEE_COUNT.set(count.try_into().unwrap_or_default());

That one uses .unwrap_or_default() to degrade gracefully instead of panicking. The new code should match:

LEAN_ATTESTATION_COMMITTEE_SUBNET.set(subnet_id.try_into().unwrap_or_default());

In practice subnet_id = validator_id % attestation_committee_count will always be small, but panicking on a metrics write would be an embarrassing crash. Consistency with the existing pattern also matters.


crates/net/p2p/src/lib.rs — call site

metrics::set_attestation_committee_subnet(subnet_id.unwrap_or(0));

The unwrap_or(0) duplicates the logic already expressed in the match arm below (None => format!("{ATTESTATION_SUBNET_TOPIC_PREFIX}_0")). This is fine as-is since the PR description explains the reasoning, but a brief inline comment like the one on that None arm would help future readers understand why 0 is correct for non-validators, rather than it looking like an arbitrary default.


docs/metrics.md

Row placement and formatting are consistent with surrounding entries. No issues.


Summary

The only actionable change: swap .unwrap() for .unwrap_or_default() in metrics.rs to match the established pattern in blockchain/src/metrics.rs:371 and avoid a theoretical panic. Everything else looks good.


Automated review by Claude (Anthropic) · custom prompt

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR exposes the pre-computed attestation committee subnet ID as a Prometheus IntGauge (lean_attestation_committee_subnet), bringing ethlambda closer to full leanMetrics compliance. The subnet ID was already calculated at startup in build_swarm for gossipsub topic subscription; the change simply plumbs it into a new metric setter and documents it.

  • crates/net/p2p/src/metrics.rs: New set_attestation_committee_subnet(subnet_id: u64) using a function-scoped LazyLock<IntGauge>; the .try_into().unwrap_or_default() conversion is safe and consistent with the cited blockchain metrics pattern.
  • crates/net/p2p/src/lib.rs: Metric is set immediately after subnet_id is computed, using unwrap_or(0) to default non-validator nodes to subnet 0 — consistent with the fanout publishing logic that follows.
  • docs/metrics.md: New row correctly added to the Network Metrics table.
  • Minor style inconsistency: the new static is function-scoped while every other metric in metrics.rs uses a module-level LazyLock static.

Confidence Score: 5/5

  • This PR is safe to merge — it adds a read-only gauge metric with no impact on consensus or network behaviour.
  • The change is narrow and low-risk: it reads an already-computed value and records it in a Prometheus gauge. The conversion is safe (try_into().unwrap_or_default()), the default for non-validators (0) matches existing behaviour, and no existing logic is modified. The only finding is a minor style inconsistency in static scoping.
  • No files require special attention.

Important Files Changed

Filename Overview
crates/net/p2p/src/metrics.rs Adds set_attestation_committee_subnet with a function-scoped LazyLock<IntGauge>; logic is correct but the scoping pattern is inconsistent with the module-level statics used for every other metric in this file.
crates/net/p2p/src/lib.rs Calls set_attestation_committee_subnet(subnet_id.unwrap_or(0)) immediately after computing the subnet ID; placement and default value (0 for non-validators) are correct and match the topic-subscription logic below.
docs/metrics.md Adds the lean_attestation_committee_subnet row to the Network Metrics table with correct type, usage description, and supported status.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant build_swarm (lib.rs)
    participant metrics.rs
    participant Prometheus Registry

    Caller->>build_swarm (lib.rs): build_swarm(config)
    build_swarm (lib.rs)->>build_swarm (lib.rs): subnet_id = config.validator_id.map(|vid| vid % attestation_committee_count)
    build_swarm (lib.rs)->>metrics.rs: set_attestation_committee_subnet(subnet_id.unwrap_or(0))
    metrics.rs->>Prometheus Registry: LazyLock::new → register_int_gauge!("lean_attestation_committee_subnet") [once]
    metrics.rs->>Prometheus Registry: LEAN_ATTESTATION_COMMITTEE_SUBNET.set(subnet_id as i64)
    build_swarm (lib.rs)->>build_swarm (lib.rs): subscribe to attestation topic (validators only)
    build_swarm (lib.rs)->>Caller: swarm
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/net/p2p/src/metrics.rs
Line: 72-81

Comment:
**Function-scoped static inconsistent with file pattern**

The three existing metrics in this file (`LEAN_CONNECTED_PEERS`, `LEAN_PEER_CONNECTION_EVENTS_TOTAL`, `LEAN_PEER_DISCONNECTION_EVENTS_TOTAL`) all use module-level `LazyLock` statics, while this new gauge uses a function-scoped static. Both are valid Rust, but mixing them in the same file creates an inconsistency.

The PR description notes this follows the blockchain metrics module pattern, but it would be more consistent with the rest of `metrics.rs` to hoist the static to module scope:

```rust
static LEAN_ATTESTATION_COMMITTEE_SUBNET: LazyLock<IntGauge> = LazyLock::new(|| {
    register_int_gauge!(
        "lean_attestation_committee_subnet",
        "Node's attestation committee subnet"
    )
    .unwrap()
});

/// Set the attestation committee subnet gauge.
pub fn set_attestation_committee_subnet(subnet_id: u64) {
    LEAN_ATTESTATION_COMMITTEE_SUBNET.set(subnet_id.try_into().unwrap_or_default());
}
```

This aligns with how every other metric in this file is structured.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Update crates/net/p2..."

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@pablodeymo
Copy link
Collaborator Author

@greptile-apps please review it again

Copy link
Collaborator

@MegaRedHand MegaRedHand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +210 to +213
let subnet_id = config
.validator_id
.map(|vid| vid % config.attestation_committee_count);
metrics::set_attestation_committee_subnet(subnet_id.unwrap_or(0));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should put this inside the config.is_aggregator if.

Comment on lines 215 to 219
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be simplified to subnet_id.unwrap_or(0), like in the metric call

simplify subnet_id match, and use unwrap_or_default for try_into
…gators

The leanMetrics spec defines this as "on node start" without a validator-only
qualifier. Non-validators use subnet 0 for fanout publishing, so exposing 0
is accurate.
@pablodeymo pablodeymo merged commit c6060d5 into main Mar 18, 2026
2 checks passed
@pablodeymo pablodeymo deleted the add-attestation-committee-subnet-metric branch March 18, 2026 21:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants