Distributed Tracing
Passage records a trace for every player connection — a timeline of everything that happened from the moment a client connected until they were transferred to a backend server. These traces, together with the metrics Passage exports, give you a complete picture of your network’s health and player experience.
Prerequisite: Tracing and metrics must be enabled in configuration before anything appears in your observability stack. See Monitoring and Observability for setup instructions.
What a Trace Looks Like
Each player connection produces a single trace. In your tracing tool (Grafana Tempo, Jaeger, etc.) it appears as a timeline bar labelled passage with a duration equal to the full connection time, typically a few hundred milliseconds for a successful transfer.
The trace is broken into phases that reflect the Minecraft protocol:
| Phase | What it represents |
|---|---|
| Status check | Responding to a server-list ping (no login involved) |
| Authentication | Verifying the player’s identity with the configured auth adapter |
| Configuration | Delivering resource packs and keep-alive packets before the transfer |
| Transfer | Sending the backend address and closing the connection |
Slow or failed connections show up as unusually long or error-marked traces. A spike in authentication duration, for example, points directly to a slow or unreachable auth adapter.
Service Identity
Every trace and metric Passage emits is tagged with the following labels so you can filter by environment and version in your dashboards:
| Label | Value |
|---|---|
| Service name | passage |
| Service namespace | scrayosnet |
| Service version | Passage version (e.g. 0.3.0) |
| Environment | Your otel.environment config value (e.g. production) |
Metrics Reference
Connection Metrics
These metrics give you a real-time view of traffic flowing through Passage.
| Metric | What it measures |
|---|---|
| listener_requests | Total incoming connections. The decision label splits this into accepted (processed normally) and rejected (dropped by the rate limiter or a proxy protocol error). |
| open_connections | How many player connections are currently being handled. |
| connection_duration | How long connections take from start to finish, in seconds. Watch the p95/p99 here: a rise indicates something is slowing down the authentication or discovery phase. |
| transfer_connections | Connections grouped by type: status (server-list pings), login (new player logins), or transfer (reconnecting players using a transfer cookie). |
| rate_limiter_size | The number of IPs currently tracked by the rate limiter. This should stay small during normal operation and reset itself automatically. A high value may indicate a connection flood. |
| client_locales | Distribution of player client languages. Useful for knowing which languages to prioritize for localized disconnect messages. |
| client_view_distances | Distribution of view distances reported by clients during login. |
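As a rough illustration of how the accepted and rejected values of listener_requests turn into a rejection share, the sketch below derives per-second rates from two scrapes. It assumes listener_requests behaves like a monotonically increasing counter, which the description suggests but does not state; the function names and sample numbers are hypothetical.

```python
def per_second_rate(previous, current, interval_seconds):
    """Rate of a monotonically increasing counter between two scrapes."""
    if current < previous:
        # Counter reset (for example after a restart): count from zero.
        previous = 0
    return (current - previous) / interval_seconds


def rejection_share(accepted_rate, rejected_rate):
    """Fraction of incoming connections that were rejected."""
    total = accepted_rate + rejected_rate
    return rejected_rate / total if total else 0.0


# Hypothetical counter values taken 60 seconds apart.
accepted = per_second_rate(1200, 1500, 60)
rejected = per_second_rate(30, 45, 60)
print(f"{rejection_share(accepted, rejected):.1%} of connections rejected")
```

In practice a metrics backend computes these rates for you; the sketch only shows the arithmetic behind the number.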
System Metrics
When system_observer_interval is configured, Passage also reports host-level metrics. These help correlate player-facing issues with resource pressure on the host:
| Metric | What it measures |
|---|---|
| cpu_usage | Overall CPU usage of the host (0–100%). |
| total_memory / used_memory / free_memory / available_memory | System RAM in bytes. |
| total_swap / used_swap / free_swap | Swap space in bytes. |
Correlating Traces Across Backend Servers
When Passage creates a session cookie for a player, it embeds the current trace ID into the cookie. This means your backend servers can attach their own spans to the same trace, giving you an unbroken timeline from Passage all the way through your backend network.
The session cookie (passage:session) includes an extra field containing a traceparent value:

    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "server_address": "play.example.com",
      "server_port": 25565,
      "extra": {
        "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
      }
    }

The traceparent follows the W3C Trace Context standard and contains the trace ID that links the player’s entire journey. A backend server that reads this value and passes it to its own OpenTelemetry SDK will appear as a connected child in the same trace, with no separate correlation step needed.
This enables scenarios like:
- Viewing a single trace that covers Passage authentication, resource pack delivery, and the player’s first few seconds on a lobby server
- Searching your trace backend by trace ID to find every service that touched a specific player connection
- Setting alerts on end-to-end latency rather than per-service latency
The authentication cookie (passage:authentication) also has an extra field but it does not carry trace context — only the session cookie does.
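To make the traceparent handling concrete, here is a minimal sketch of how a backend might pull the trace context out of a session cookie payload before handing it to its tracing SDK. It assumes the payload is already available as a JSON string shaped like the example above; how the cookie is transported, signed, or decoded is not covered here, and extract_trace_context is a hypothetical helper name. Only the version 00 layout of the W3C format (version, trace ID, parent ID, flags) is handled.

```python
import json
import re

# W3C Trace Context, version 00: version-traceid-parentid-flags in lowercase hex.
TRACEPARENT_RE = re.compile(
    r"(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})"
)


def extract_trace_context(cookie_json):
    """Return the parsed traceparent fields, or None if absent or invalid."""
    cookie = json.loads(cookie_json)
    extra = cookie.get("extra")
    value = extra.get("traceparent") if isinstance(extra, dict) else None
    if not isinstance(value, str):
        return None
    match = TRACEPARENT_RE.fullmatch(value)
    if match is None:
        return None
    fields = match.groupdict()
    # This sketch only understands version 00.
    if fields["version"] != "00":
        return None
    # All-zero trace or parent IDs are invalid per the specification.
    if set(fields["trace_id"]) == {"0"} or set(fields["parent_id"]) == {"0"}:
        return None
    # The lowest bit of the flags byte is the "sampled" flag.
    fields["sampled"] = bool(int(fields["flags"], 16) & 0x01)
    return fields


cookie = json.dumps({
    "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "server_address": "play.example.com",
    "server_port": 25565,
    "extra": {
        "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    },
})
context = extract_trace_context(cookie)
print(context["trace_id"], context["parent_id"], context["sampled"])
```

A real backend would normally pass the raw traceparent string to its OpenTelemetry propagator rather than parse it by hand; the manual parsing here is only meant to show what the value contains.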
Recommended Dashboards and Alerts
Dashboards to build
- Traffic overview: listener_requests total rate, split by decision. Shows overall throughput and rejection rate over time.
- Connection latency: connection_duration histogram (p50, p95, p99). The single most useful signal for player-facing performance.
- Connection types: transfer_connections by state. The ratio of transfer to login shows how effectively auth cookies are working: more transfers means fewer Mojang API calls.
- Active connections: open_connections as a live gauge. Pair with connection_duration to spot overload.
- Host health: cpu_usage and used_memory alongside connection metrics to catch resource-pressure incidents.
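The percentiles on the latency dashboard are estimates derived from histogram buckets, not exact values. The sketch below shows one common way such an estimate is made: find the bucket containing the target rank and interpolate linearly inside it. The bucket boundaries and counts are invented for illustration and are not Passage's actual bucket layout, and estimate_quantile is a hypothetical helper.

```python
def estimate_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) buckets.

    buckets must be sorted by upper bound and end with float("inf").
    """
    total = buckets[-1][1]
    if total == 0:
        return None
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # No finite upper edge: report the last finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Invented cumulative counts for connection_duration, in seconds.
buckets = [(0.1, 50), (0.25, 80), (0.5, 95), (1.0, 99), (float("inf"), 100)]
for q in (0.5, 0.9, 0.95):
    print(q, estimate_quantile(q, buckets))
```

Because the result depends on where the bucket edges fall, a p99 reading is only as precise as the buckets around it.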
Alerts to consider
- listener_requests{decision="rejected"} rate above baseline: possible flood or upstream misconfiguration
- connection_duration p99 above ~2 seconds: adapter is slow or unreachable
- open_connections growing without new listener_requests: connections are stalling
Sampling in Production
Every player connection produces a trace, and for high-traffic networks this volume can be expensive to store. Most tracing pipelines support head-based sampling: configure your OTel Collector or tracing backend to keep only a percentage of traces (5–10% is usually enough for latency analysis). Error traces should always be kept regardless of the sampling rate. Because a head-based sampler decides before the outcome of a connection is known, guaranteeing this requires a tail-based policy that inspects the finished trace.
If you use an OTel Collector between Passage and your backend, the probabilistic sampler processor is the simplest option:

    processors:
      probabilistic_sampler:
        sampling_percentage: 10

Passage itself does not perform sampling: all traces are exported and sampling decisions are left to the pipeline.
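To show why percentage-based sampling can stay consistent across services, the following sketch derives the keep-or-drop decision deterministically from the trace ID, so every component that sees the same trace reaches the same verdict. This is a simplified illustration and not the Collector's actual hashing scheme; keep_trace is a hypothetical function.

```python
import hashlib


def keep_trace(trace_id, sampling_percentage):
    """Decide deterministically from the trace ID whether to keep a trace."""
    digest = hashlib.sha256(bytes.fromhex(trace_id)).digest()
    # Treat the first 8 bytes of the hash as a position in [0, 2**64).
    position = int.from_bytes(digest[:8], "big")
    return position < sampling_percentage / 100 * 2**64


trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
print(keep_trace(trace_id, 10))
```

Since the decision depends only on the trace ID, it cannot take errors into account; keeping all error traces needs a policy that looks at the completed trace.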