This example demonstrates how to integrate OpenTelemetry metrics with the Pinecone Java SDK using the ResponseMetadataListener feature. It captures latency metrics for all data plane operations and exports them to Prometheus/Grafana for visualization.
- Captures client-side latency (total round-trip time) for Pinecone operations
- Captures server-side latency from the `x-pinecone-response-duration-ms` header
- Calculates network overhead (client duration minus server duration)
- Exports metrics to OpenTelemetry-compatible backends (Prometheus, Grafana, Datadog, etc.)
| Metric | Type | Description |
|---|---|---|
| `db.client.operation.duration` | Histogram | Client-measured round-trip time (ms) |
| `pinecone.server.processing.duration` | Histogram | Server processing time from header (ms) |
| `db.client.operation.count` | Counter | Total number of operations |
| Attribute | Description |
|---|---|
| `db.system` | Always `"pinecone"` |
| `db.operation.name` | Operation type (upsert, query, fetch, update, delete) |
| `db.namespace` | Pinecone namespace |
| `pinecone.index_name` | Index name |
| `server.address` | Pinecone host |
| `status` | `"success"` or `"error"` |
- Java 8+
- Maven 3.6+
- Docker and Docker Compose
- A Pinecone account with an API key and index
```
java-otel-metrics/
├── pom.xml                         # Maven dependencies
├── README.md                       # This file
├── observability/                  # Local observability stack
│   ├── docker-compose.yml          # Prometheus + Grafana + OTel Collector
│   ├── otel-collector-config.yaml  # OTel Collector configuration
│   └── prometheus.yml              # Prometheus scrape config
└── src/main/java/pineconeexamples/
    ├── PineconeOtelMetricsExample.java  # Main example
    └── PineconeMetricsRecorder.java     # Reusable metrics recorder
```
```bash
cd examples/java-otel-metrics/observability
docker-compose up -d
```

This starts:
- OpenTelemetry Collector (port 4317) - receives metrics via OTLP
- Prometheus (port 9090) - stores metrics
- Grafana (port 3000) - visualizes metrics
```bash
cd examples/java-otel-metrics
export PINECONE_API_KEY=your-api-key
export PINECONE_INDEX=your-index-name
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
mvn package exec:java -Dexec.mainClass="pineconeexamples.PineconeOtelMetricsExample"
```

- Open http://localhost:3000
- Log in with `admin`/`admin`
- Go to Connections → Data sources → Add data source
- Select Prometheus, set the URL to `http://prometheus:9090`, and click Save & test
- Go to Dashboards → New → New Dashboard → Add visualization
P50 Client vs Server Latency:

```promql
histogram_quantile(0.5, sum(rate(db_client_operation_duration_milliseconds_bucket[5m])) by (le))
histogram_quantile(0.5, sum(rate(pinecone_server_processing_duration_milliseconds_bucket[5m])) by (le))
```

P95 Latency by Operation:

```promql
histogram_quantile(0.95, sum(rate(db_client_operation_duration_milliseconds_bucket[5m])) by (le, db_operation_name))
```

Operation Count by Type:

```promql
sum by (db_operation_name) (db_client_operation_count_total)
```
| Percentile | Meaning |
|---|---|
| P50 | Median (typical latency) |
| P90 | 90% of requests are faster |
| P95 | Tail latency - good for SLAs |
| P99 | Worst-case for most users |
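To make these percentiles concrete, here is a small self-contained Java sketch (illustrative only, not part of this example's code) that computes percentiles from raw latency samples using the nearest-rank method — the same idea Prometheus approximates from histogram buckets:

```java
import java.util.Arrays;

public class PercentileSketch {
    // Nearest-rank percentile: the smallest value such that at least p% of
    // the samples are at or below it. Expects a sorted array.
    static double percentile(double[] sortedSamples, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedSamples.length);
        return sortedSamples[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        // Hypothetical latency samples in milliseconds.
        double[] latenciesMs = {12, 15, 18, 20, 22, 25, 30, 45, 80, 200};
        Arrays.sort(latenciesMs);
        System.out.println("P50: " + percentile(latenciesMs, 50));  // typical latency
        System.out.println("P95: " + percentile(latenciesMs, 95));  // tail latency
    }
}
```

Note how a single slow outlier dominates P95/P99 while barely moving P50 — which is why tail percentiles, not averages, are the right basis for SLAs.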
The difference between client and server duration shows network overhead:
Network Overhead = Client Duration - Server Duration
This helps identify whether latency issues are:
- Server-side (high server duration)
- Network-side (high network overhead)
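The calculation itself is a simple subtraction; the sketch below (hypothetical numbers, not taken from this example's code) shows it along with a rough classification of where the latency is coming from:

```java
public class NetworkOverheadSketch {
    // Client duration is the full round-trip; server duration comes from the
    // x-pinecone-response-duration-ms response header.
    static double networkOverheadMs(double clientDurationMs, double serverDurationMs) {
        return clientDurationMs - serverDurationMs;
    }

    // Rough heuristic: whichever component contributes more to total latency.
    static String dominantSource(double clientDurationMs, double serverDurationMs) {
        double overhead = networkOverheadMs(clientDurationMs, serverDurationMs);
        return overhead > serverDurationMs ? "network" : "server";
    }

    public static void main(String[] args) {
        double clientMs = 48.0;  // measured client-side
        double serverMs = 31.0;  // reported by the server header
        System.out.println("Overhead: " + networkOverheadMs(clientMs, serverMs) + " ms");
        System.out.println("Dominant: " + dominantSource(clientMs, serverMs));
    }
}
```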
```bash
cd examples/java-otel-metrics/observability
docker-compose down
```

Copy `PineconeMetricsRecorder.java` into your project:
```java
Meter meter = meterProvider.get("pinecone.client");
PineconeMetricsRecorder recorder = new PineconeMetricsRecorder(meter);

Pinecone client = new Pinecone.Builder(apiKey)
    .withResponseMetadataListener(recorder)
    .build();

// All operations now emit metrics automatically
Index index = client.getIndexConnection(indexName);
index.upsert(...);  // Metrics recorded!
index.query(...);   // Metrics recorded!
```
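If you want to adapt the pattern without pulling in the full example, the core idea of the recorder can be sketched in plain Java. Note that the listener interface below is a simplified stand-in for illustration, not the actual Pinecone `ResponseMetadataListener` signature, and the recorder appends to an in-memory list where the real `PineconeMetricsRecorder` would record to OpenTelemetry histograms:

```java
import java.util.ArrayList;
import java.util.List;

public class RecorderSketch {
    // Simplified stand-in for the SDK's response-metadata callback.
    interface ResponseListener {
        void onResponse(String operation, double clientDurationMs, Double serverDurationMs);
    }

    static class MetricsRecorder implements ResponseListener {
        final List<String> recorded = new ArrayList<>();

        @Override
        public void onResponse(String operation, double clientDurationMs, Double serverDurationMs) {
            // Always record the client-side round-trip time.
            recorded.add(operation + ":client:" + clientDurationMs + "ms");
            // Only derive network overhead when the server header was present.
            if (serverDurationMs != null) {
                double overhead = clientDurationMs - serverDurationMs;
                recorded.add(operation + ":overhead:" + overhead + "ms");
            }
        }
    }

    public static void main(String[] args) {
        MetricsRecorder recorder = new MetricsRecorder();
        recorder.onResponse("query", 48.0, 31.0);   // header present
        recorder.onResponse("upsert", 20.0, null);  // header missing: skip overhead
        recorder.recorded.forEach(System.out::println);
    }
}
```

The key design point carried over from the real example: the server duration is optional, so the recorder must tolerate its absence rather than assume every response carries the header.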