Skip to content

Expand user guide documentation and storage example with REPL#288

Merged
meling merged 48 commits intomasterfrom
meling/285/doc-update-symmetric-configurations
Mar 14, 2026
Merged

Expand user guide documentation and storage example with REPL#288
meling merged 48 commits intomasterfrom
meling/285/doc-update-symmetric-configurations

Conversation

@meling
Copy link
Member

@meling meling commented Mar 12, 2026

Documentation and Expanded Storage Example

Significantly expands the user guide and developer guide documentation, and upgrades the examples/storage example to cover all call types — including unicast, multicast, correctable reads, and nested (symmetric) quorum calls using ServerCtx.Config(). Adds NewLocalSystems to the core library for pre-allocating local test clusters, and removes the legacy gorumsexample directory in favor of the richer storage example.

Changes

Documentation (doc/)

  • user-guide.md:
    • Expand the call types table to include synchronous quorum calls and correctable streaming calls.
    • Add a Mermaid sequence diagram illustrating multicast, quorum call, and correctable call flows.
    • Separate node-context vs. config-context generated functions into distinct code blocks with explanatory prose.
    • Clarify that unicast and multicast are one-way — no server-side result is returned.
    • Clarify that the handler context is derived from the gRPC stream context and extended with per-request metadata.
    • Add a Server-Side Interceptors section with code examples for logging, delay, metadata injection, and request filtering interceptors.
    • Add an Interactive REPL section documenting all REPL commands, flags, and an example session.
    • Add a Nested Quorum Calls with ServerCtx.Config section, covering peer tracking setup, ctx.Release() usage, handler code, and a sequence diagram.
    • Add a Reverse Direction Calls with ServerCtx.ClientConfig section, covering firewall use cases, client setup, handler code, and a sequence diagram.
    • Update proto package references from gorumsexample to storage/proto.
  • dev-guide.md:
    • Clarify what counts as "static code" in dev/ (non-zorums_ .go files excluding test files).
    • Expand the Makefile targets table with all current targets.

Storage example (examples/storage/)

  • proto/storage.proto — Add call types:
    • WriteUnicast (unicast)
    • WriteMulticast (multicast)
    • ReadNestedQC (quorum call with server-side fan-out via ServerCtx.Config())
    • WriteNestedMulticast (quorum call wrapping a nested multicast)
    • ReadCorrectable (streaming quorum call)
  • server.go:
    • Implement WriteUnicast, ReadCorrectable, ReadNestedQC, and WriteNestedMulticast handlers.
    • Add rawWriter helper that translates \n to \r\n for correct output in raw terminal mode.
    • Replace startServer helper with gorums.NewSystem-based setup.
    • Remove hard-coded interceptors from the server; interceptors are now configured via CLI flag.
    • Update newStorageServer to accept io.Writer and a label for logging.
  • main.go:
    • Replace manual server loop with gorums.NewLocalSystems for local cluster setup.
    • Add --interceptors flag supporting logging, nofoo, metadata, delayed.
    • Call sys.NewOutboundConfig (no explicit node list needed — stored by NewLocalSystems).
  • repl.go:
    • Add ucast / unicast command for one-way unicast writes.
    • Add cread sub-command for correctable reads (streams progressive updates).
    • Add nread and nwrite sub-commands for nested quorum call / nested multicast.
    • Replace github.com/google/shlex with a self-contained splitQuoted implementation (removes external dependency).
    • Refactor parseConfiguration into parseConfiguration + parseIndices (error-returning, no implicit fmt.Println).
    • Fix cfg range syntax to allow stop == numNodes (exclusive end).
    • Add delayOutput constant for short post-call delay so server log lines appear before the prompt.
    • Remove stale FIXME comment about arrow keys on Windows.

Core library

  • system.go:
    • Add NewLocalSystems(n int, opts ...ServerOption) ([]*System, func(), error) — pre-allocates all listeners before creating any system (solves the chicken-and-egg address problem), configures each system with WithConfig (node IDs 1..n), and returns a stop closure.
    • Add nodeList field to System to store the node list set by NewLocalSystems.
    • Update NewOutboundConfig to use the stored node list automatically, simplifying call sites.

Testing

All existing tests continue to pass. The storage example builds and runs end-to-end with the new REPL commands and local cluster setup. Storage example has been tested manually.

Related

Closes #285

meling added 26 commits March 8, 2026 17:55
This replaces the gorumsexample with storageexample and removes the
obsolete tutorial proto.
- Remove auto-serve from NewLocalSystems; callers must now call
  sys.Serve() after registering services, fixing a racy
  RegisterHandler-after-Serve pattern.
- Remove dead server functions (startServer2, startServerOnListener,
  startServer) and simplify runServer to use NewSystem directly.
- Move terminal creation into newRepl(), removing the awkward
  NewTerminal inline struct from main() and the *term.Terminal
  parameter threading.
- Add -interceptors CLI flag for comma-separated interceptor
  selection at startup (logging, nofoo, metadata, delayed).
- Add doc/issue-dynamic-interceptors.md for future consideration
  of runtime interceptor toggling.
To avoids attempting to call the local node in nested calls.
This should prevent most issues where servers emit output after the
client finishes a nested quorum call, multicast, or unicast calls that
does not wait for a response.
Previously, creating a sub-configuration using cfg commands
in the storage repl (e.g., cfg 0,2) failed with an "address
already in use by node" error. This could happened because
parseConfiguration used gorums.WithNodeList with the addresses
of the selected nodes. WithNodeList is designed to instantiate
new nodes with automatically generated IDs, which caused a
conflict since those addresses were already registered in the
manager's node registry under different IDs.

This commit resolves the issue by directly extracting the
existing pb.Node instances from the manager, sorting them by
ID (as NewConfiguration normally would), and casting the
resulting slice to pb.Configuration. This correctly reuses the
existing nodes and their identifiers without attempting to
register duplicate connections.
This replaces the "archived" shlex.Split dependency with a simple
splitQuoted custom implementation that supports what we need for our
simple repl implementation, namely quoted arguments. We do not need full
Posix command line parsing supported by shlex and replacement libraries.

This also updates the dependencies and the go version since we are
already touching go.mod.
I believe this has been fixed in 2024 with this commit:
golang/term@f867b76
Add a new section explaining how to use the interactive REPL tool
included in examples/storage. The section covers how to start the
storage example (both locally and against remote servers), lists every
supported command in a reference table, and includes an example session
showing reads, writes, unicast/multicast, correctable, and nested
quorum-call operations. A forward reference to the nested-QC and
ClientConfig sections is included. These sections will be added in
separate follow up commits.
Add a new section explaining the nested quorum call pattern where a
server handler issues its own quorum calls to peer nodes using
ServerCtx.Config(). The section covers how to enable peer tracking with
WithConfig, the importance of calling ctx.Release() before outbound
nested calls to avoid blocking inbound message processing, handler
examples for both ReadNestedQC and WriteNestedMulticast, and a sequence
diagram illustrating the client-to-server-to-peers call flow.
This covers the addition of the new 'Reverse Direction Calls with
ServerCtx.ClientConfig' section, which documents the firewall use
case, WithClientConfig() server setup, anonymous client registration
via Register, the handler pattern using ctx.ClientConfig(), the
comparison table, and the sequence diagram.
Remove the peerCfg field from storageServer and the manual self-exclusion
loop in main.go. NewOutboundConfig now uses the node list stored by
NewLocalSystems, so callers no longer need to filter out their own address
or pass an explicit node list. ReadNestedQC and WriteNestedMulticast always
use ctx.Config() directly. System.NewOutboundConfig is updated to compose
the base options (stored node list and optional request handler) before any
caller-supplied options.
@deepsource-io
Copy link
Contributor

deepsource-io bot commented Mar 12, 2026

DeepSource Code Review

We reviewed changes in 08d70f0...9ec7ddd on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.

See full review on DeepSource ↗

PR Report Card

Overall Grade   Security  

Reliability  

Complexity  

Hygiene  

Code Review Summary

Analyzer Status Updated (UTC) Details
Go Mar 14, 2026 3:54p.m. Review ↗
Shell Mar 14, 2026 3:54p.m. Review ↗

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Gorums documentation and examples to better cover all call types (unicast, multicast, quorum calls, correctable streaming) and symmetric/nested call patterns, while adding a core helper (NewLocalSystems) to simplify local multi-node setups and outbound configuration.

Changes:

  • Expand doc/user-guide.md and doc/dev-guide.md with updated guidance, diagrams, interceptor docs, and REPL usage.
  • Upgrade examples/storage to cover additional call types (including correctable + nested calls) and extend the REPL accordingly; remove the legacy examples/gorumsexample proto.
  • Add gorums.NewLocalSystems and update System.NewOutboundConfig to reuse a stored node list when available.

Reviewed changes

Copilot reviewed 16 out of 19 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
system.go Adds NewLocalSystems, stores a node list on System, and changes NewOutboundConfig defaults.
go.work Bumps workspace Go version to 1.26.0.
go.sum Updates dependency checksums after module/version bumps.
go.mod Bumps module Go version to 1.26.0 and updates dependencies.
examples/storage/server.go Refactors server startup via gorums.NewSystem and adds new handlers (unicast, multicast, correctable, nested calls).
examples/storage/repl.go Adds new REPL commands and replaces shlex with a local quoted-splitting implementation.
examples/storage/proto/storage_gorums.pb.go Regenerated Gorums bindings for new call types.
examples/storage/proto/storage.proto Extends the Storage service with unicast, multicast, nested quorum calls, and correctable streaming quorum call.
examples/storage/proto/storage.pb.go Regenerated protobuf bindings for updated storage proto.
examples/storage/main.go Uses gorums.NewLocalSystems, adds interceptor flag parsing, and sets up local systems.
examples/storage/client.go Minor cleanup while keeping configuration creation for the REPL client.
examples/gorumsexample/storage.proto Removes legacy tutorial proto (directory deprecation).
examples/go.sum Updates example-module checksums (including removal of google/shlex).
examples/go.mod Bumps example module Go version to 1.26.0 and updates deps.
doc/user-guide.md Major expansion: call types, diagrams, interceptors, REPL docs, nested call patterns.
doc/dev-guide.md Clarifies “static code” definition in dev/ and expands Makefile targets table.
AGENTS.md Adds conventional-commit requirements; removes a design-doc reference.
.vscode/gorums.txt Adds new words/commands to workspace dictionary.
.gitignore Removes ignores for legacy tutorial generated files (since tutorial example is removed).
Comments suppressed due to low confidence (1)

doc/user-guide.md:123

  • This section says the runnable example is examples/storage, but the sequence diagram and proto snippet still use method names like ReadOrdered/ReadQCStream that don’t exist in examples/storage/proto/storage.proto (which uses ReadRPC/WriteRPC/ReadCorrectable, etc.). Please update the diagram/snippet to match the actual storage example API, or clarify that it’s a separate tutorial proto.
    Note over Client,Node B: Correctable (Streaming)
    Client->>Node A: ReadQCStream(Req)
    Client->>Node B: ReadQCStream(Req)
    Node A-->>Client: ReadResponse(v1, t1)
    Node B-->>Client: ReadResponse(v1, t1)
    Note over Client: Client reads first quorum result
    Node A-->>Client: ReadResponse(v2, t2)
    Node B-->>Client: ReadResponse(v2, t2)
    Note over Client: Client reads updated quorum result

Configuring Call Types in Protobuf

The file storage.proto should have the following content, illustrating the different call types:

edition = "2024";

package proto;
option go_package = "github.com/relab/gorums/examples/storage/proto";
option features.field_presence = IMPLICIT;

import "google/protobuf/empty.proto";
import "gorums.proto";

// Storage service defines the RPCs for a simple key-value storage system.
service Storage {
  // ReadOrdered is a FIFO-ordered RPC to a single node.
  rpc ReadOrdered(google.protobuf.Empty) returns (State) {}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

meling added 6 commits March 12, 2026 13:39
Add isOption() to ServerOption so it satisfies the Option interface,
allowing NewSystem and NewLocalSystems to accept a single ...Option
variadic instead of ...ServerOption.

NewLocalSystems auto-creates an outbound Configuration per system
when ManagerOptions (e.g. dial options) are provided, accessible via
System.OutboundConfig(). NewSystem does the same when a NodeListOption
is present. The internal createOutboundConfig helper combines the node
list, self-routing request handler, and manager options.

NewOutboundConfig is retained for asymmetric setups where the node list
differs per system; it no longer requires a stored nodeList field, which
is removed from the System struct.

NewConfig now returns a clear error when a ServerOption is passed to it
by mistake instead of the generic unknown-type error.

TestSystems is simplified to wrap NewLocalSystems directly, removing the
callback-based setup and duplicate listener pre-allocation. All symmetric
test helpers in system_test.go are updated to use sys.OutboundConfig().
The storage example is simplified to pass dial options directly to
NewLocalSystems.
- remove WithClientConfig and always initialize inbound peer management
- track only peer-capable clients (gorums-node-id: 0) as dynamic peers
- keep regular clients without gorums-node-id out of ClientConfig
- add Server.NewConfig to create outbound configs with back-channel
  handler support
- update inbound manager logic and tests for revised client/peer behavior
Extract splitOptions to DRY up option parsing in NewSystem and
NewLocalSystems, and guard against typed-nil options via
reflect.ValueOf.IsNil.

Rename createOutboundConfig to newOutboundConfig with variadic
ManagerOption signature; always include withRequestHandler
unconditionally rather than conditionally on inboundManager.

Reject NodeListOption in NewLocalSystems and always create the
outbound Configuration. Extract allocateListeners as a standalone
helper function.

Remove the NewOutboundConfig public method; asymmetric setups
should use Server.NewConfig directly instead.

Rename system_test.go tests for clarity. Simplify
createClientServerSystems to use a plain *Server as the client
instead of a full System with its own listener.

Add opts_test.go with table-driven tests covering typed-nil
handling in splitOptions.
A client must explicitly announce NodeID=0 in its connection metadata
to be assigned a dynamic node ID by the server and to appear in
ClientConfig() for reverse-direction calls.

Update the "Connecting as an Anonymous Client" section to:
- Lead with the NodeID=0 announcement as the enabling mechanism
- Replace the stale myID reference with the current node ID wording
- Link the announcement explicitly to enabling reverse-direction calls
- Note that clientSrv.NewConfig without WithConfig sends NodeID=0 automatically
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 24 out of 27 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

doc/user-guide.md:123

  • The guide states that the runnable example and proto definition live in examples/storage/proto/storage.proto, but the proto snippet and diagrams below still use the older API shape (e.g., ReadOrdered/ReadQCStream/State) which does not match examples/storage/proto/storage.proto in this PR (ReadRPC/WriteRPC/ReadCorrectable/etc.). Please either update the snippet/diagrams to match the actual storage example, or clarify that this section uses a simplified, separate proto that is not the same as the runnable example.
The runnable version of this example lives in `examples/storage`.
The protobuf definition for that example lives in `examples/storage/proto/storage.proto`.
If you are starting from scratch, you can still create your own module and use the same proto structure.

```shell
mkdir $HOME/storageexample
cd $HOME/storageexample
go mod init storageexample
go get github.com/relab/gorums

Call Types

Gorums offers several call types including quorum calls, and one-way unicast and multicast communication.
To select a call type for a Protobuf service method, specify one of the following options (they cannot be combined):

Call type Gorums option Description
Ordered RPC no option FIFO-ordered synchronous RPC to a single node.
Unicast gorums.unicast FIFO-ordered one-way asynchronous unicast.
Multicast gorums.multicast FIFO-ordered one-way asynchronous multicast.
Quorum Call gorums.quorumcall FIFO-ordered asynchronous quorum call on a configuration of nodes.
Synchronous Quorum Call gorums.quorumcall Synchronous variant; chain a terminal method (.First(), .Majority(), .All(), .Threshold(n)) to block until the quorum threshold is met.
Correctable gorums.quorumcall + stream Streaming quorum call; use .Correctable(n) to observe progressive updates as nodes respond.

The generated API is similar to unary gRPC, unless the stream keyword is used in the proto definition.
Server streaming is only supported for quorum calls.
Client streaming is not supported.

sequenceDiagram
    participant Client
    participant Node A
    participant Node B

    Note over Client,Node B: Multicast (One-way)
    Client->>Node A: WriteMulticast(Req)
    Client->>Node B: WriteMulticast(Req)
    Note over Client: Client continues immediately

    Note over Client,Node B: Quorum Call (Synchronous)
    Client->>Node A: ReadQC(Req)
    Client->>Node B: ReadQC(Req)
    Node A-->>Client: ReadResponse(Value, Time)
    Node B-->>Client: ReadResponse(Value, Time)
    Note over Client: Client aggregates responses

    Note over Client,Node B: Correctable (Streaming)
    Client->>Node A: ReadQCStream(Req)
    Client->>Node B: ReadQCStream(Req)
    Node A-->>Client: ReadResponse(v1, t1)
    Node B-->>Client: ReadResponse(v1, t1)
    Note over Client: Client reads first quorum result
    Node A-->>Client: ReadResponse(v2, t2)
    Node B-->>Client: ReadResponse(v2, t2)
    Note over Client: Client reads updated quorum result
Loading

Configuring Call Types in Protobuf

The file storage.proto should have the following content, illustrating the different call types:

edition = "2024";

package proto;
option go_package = "github.com/relab/gorums/examples/storage/proto";
option features.field_presence = IMPLICIT;

import "google/protobuf/empty.proto";
import "gorums.proto";

// Storage service defines the RPCs for a simple key-value storage system.
service Storage {
  // ReadOrdered is a FIFO-ordered RPC to a single node.
  rpc ReadOrdered(google.protobuf.Empty) returns (State) {}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

meling added 9 commits March 14, 2026 10:11
System.Stop relied solely on grpc.Server.Stop to close the listener,
but gRPC only closes listeners registered via Serve. If Stop was called
before Serve (e.g. in the NewLocalSystems error-unwind path or via the
returned stop closure), the pre-allocated net.Listener leaked a file
descriptor. Fix by always calling s.lis.Close() in System.Stop after
s.srv.Stop; a harmless error is discarded when gRPC already closed it.

Add tests to verify the listener is released in both scenarios.
Allow interceptors and other options to be passed to runServer.
We removed the clientPeers argument in an earlier commit; this fixes the
documentation accordingly.
Update method names, message types, and code snippets throughout
the user guide to match examples/storage/proto/storage.proto.
Replace ReadOrdered/ReadQCStream/State with ReadRPC/ReadCorrectable/
ReadRequest/ReadResponse/WriteRequest. Add a note that the proto
snippet is a partial representation showing one RPC per call type.
Use AsTime().After() for timestamp comparisons to match the API
used in the actual client and server code.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 28 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@meling meling merged commit da14cbe into master Mar 14, 2026
5 checks passed
@meling meling deleted the meling/285/doc-update-symmetric-configurations branch March 14, 2026 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

doc: update documentation to explain how to use symmetric configurations

2 participants