| 1 | +# Native OpenTelemetry Agent Identity |
| 2 | + |
| 3 | +This guide shows how to reproduce the OpenObserve Python SDK's trace behavior with the native OpenTelemetry Python SDK. |
| 4 | + |
| 5 | +Use this when you already own your OpenTelemetry setup and do not want `openobserve_init(...)` to create providers/exporters for you. |
| 6 | + |
| 7 | +## What The OpenObserve SDK Does |
| 8 | + |
| 9 | +For traces, the SDK mainly does four things: |
| 10 | + |
| 11 | +1. Creates an OpenTelemetry `TracerProvider`. |
| 12 | +2. Configures OTLP exporters for OpenObserve endpoints and headers. |
| 13 | +3. Applies `resource_attributes` through an OpenTelemetry `Resource`. |
| 14 | +4. Adds agent identity to spans with `gen_ai.agent.id` and `gen_ai.agent.name`. |
| 15 | + |
| 16 | +The agent identity behavior is intentionally span-based. `agent_id=` and `agent_name=` in `openobserve_init(...)` are process-wide defaults stamped on spans, not resource attributes. `openobserve_agent(...)` sets request-scoped identity using OpenTelemetry baggage and overrides those defaults. |
| 17 | + |
| 18 | +## Option 1: Resource Attributes Only |
| 19 | + |
| 20 | +This is the easiest native OpenTelemetry option when one process represents one static agent. |
| 21 | + |
| 22 | +```python |
| 23 | +from opentelemetry import trace |
| 24 | +from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter |
| 25 | +from opentelemetry.sdk.resources import Resource |
| 26 | +from opentelemetry.sdk.trace import TracerProvider |
| 27 | +from opentelemetry.sdk.trace.export import BatchSpanProcessor |
| 28 | + |
| 29 | +resource = Resource.create( |
| 30 | +    { |
| 31 | +        "service.name": "support-agent-worker", |
| 32 | +        "service.version": "1.0.0", |
| 33 | +        "deployment.environment": "production", |
| 34 | +        "gen_ai.agent.name": "Support Agent", |
| 35 | +    } |
| 36 | +) |
| 37 | + |
| 38 | +provider = TracerProvider(resource=resource) |
| 39 | +exporter = OTLPSpanExporter( |
| 40 | +    endpoint="http://localhost:5080/api/default/v1/traces", |
| 41 | +    headers={ |
| 42 | +        "Authorization": "Basic <base64>", |
| 43 | +        "stream-name": "default", |
| 44 | +    }, |
| 45 | +) |
| 46 | +provider.add_span_processor(BatchSpanProcessor(exporter)) |
| 47 | +trace.set_tracer_provider(provider) |
| 48 | +``` |
| 49 | + |
| 50 | +Resource attributes are attached to every span that provider emits through the OTLP `ResourceSpans` envelope. OpenObserve stores them as service/resource fields. Its current agent extraction can fall back to resource-level `gen_ai.agent.name` / `gen_ai.agent.id` for LLM spans and promote them into the canonical agent fields. |
| 51 | + |
| 52 | +Use this only when: |
| 53 | + |
| 54 | +- The process has one static agent identity. |
| 55 | +- The identity does not change per request. |
| 56 | +- You are comfortable with resource identity being a fallback for LLM spans. |
| 57 | + |
| 58 | +Do not use this as the only mechanism when one process handles multiple agents or request-scoped agent identity. |
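
To make the `ResourceSpans` relationship concrete, here is a minimal sketch of the OTLP/JSON shape, built with plain dictionaries. It is a simplified illustration, not what the exporter literally sends: real spans also carry trace IDs, span IDs, timestamps, and their own attributes. The helper names and the `example-tracer` scope name are hypothetical.

```python
import json


def to_otlp_attributes(attrs):
    """Convert a flat dict into OTLP/JSON key-value pairs (string values only)."""
    return [{"key": k, "value": {"stringValue": str(v)}} for k, v in attrs.items()]


def resource_spans_envelope(resource_attrs, span_names):
    """Build a simplified ResourceSpans payload for illustration."""
    return {
        "resourceSpans": [
            {
                # The resource is written once and shared by every span below.
                "resource": {"attributes": to_otlp_attributes(resource_attrs)},
                "scopeSpans": [
                    {
                        "scope": {"name": "example-tracer"},  # hypothetical scope
                        "spans": [{"name": name} for name in span_names],
                    }
                ],
            }
        ]
    }


envelope = resource_spans_envelope(
    {"service.name": "support-agent-worker", "gen_ai.agent.name": "Support Agent"},
    ["agent.workflow", "llm.call"],
)
print(json.dumps(envelope, indent=2))
```

Because the agent name lives on the shared resource rather than on each span, it cannot vary between spans from the same provider, which is why this option only fits a single static identity.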
| 59 | + |
| 60 | +## Option 2: Span Processor Defaults |
| 61 | + |
| 62 | +This is closer to `openobserve_init(agent_id=..., agent_name=...)`: it stamps a process-wide default identity onto every recording span as span attributes. |
| 63 | + |
| 64 | +```python |
| 65 | +from contextvars import ContextVar |
| 66 | + |
| 67 | +from opentelemetry import baggage |
| 68 | +from opentelemetry.sdk.trace import SpanProcessor |
| 69 | + |
| 70 | +AGENT_ID_KEY = "gen_ai.agent.id" |
| 71 | +AGENT_NAME_KEY = "gen_ai.agent.name" |
| 72 | +_current_agent_identity = ContextVar("agent_identity", default=None) |
| 73 | + |
| 74 | + |
| 75 | +def _clean(value): |
| 76 | +    if value is None: |
| 77 | +        return None |
| 78 | +    value = str(value).strip() |
| 79 | +    return value or None |
| 80 | + |
| 81 | + |
| 82 | +class AgentIdentitySpanProcessor(SpanProcessor): |
| 83 | +    def __init__(self, agent_id=None, agent_name=None): |
| 84 | +        self.agent_id = _clean(agent_id) |
| 85 | +        self.agent_name = _clean(agent_name) |
| 86 | + |
| 87 | +    def on_start(self, span, parent_context=None): |
| 88 | +        if not span.is_recording(): |
| 89 | +            return |
| 90 | + |
| 91 | +        local_identity = _current_agent_identity.get() |
| 92 | +        if local_identity is not None: |
| 93 | +            # Exact SDK behavior: a local partial identity does not mix with |
| 94 | +            # static defaults or inherited baggage. |
| 95 | +            agent_id, agent_name = local_identity |
| 96 | +        elif self.agent_id is not None or self.agent_name is not None: |
| 97 | +            agent_id = self.agent_id |
| 98 | +            agent_name = self.agent_name |
| 99 | +        else: |
| 100 | +            agent_id = _clean(baggage.get_baggage(AGENT_ID_KEY, context=parent_context)) |
| 101 | +            agent_name = _clean(baggage.get_baggage(AGENT_NAME_KEY, context=parent_context)) |
| 102 | + |
| 103 | +        if agent_id is not None: |
| 104 | +            span.set_attribute(AGENT_ID_KEY, agent_id) |
| 105 | +        if agent_name is not None: |
| 106 | +            span.set_attribute(AGENT_NAME_KEY, agent_name) |
| 107 | + |
| 108 | +    def on_end(self, span): |
| 109 | +        pass |
| 110 | + |
| 111 | +    def shutdown(self): |
| 112 | +        pass |
| 113 | + |
| 114 | +    def force_flush(self, timeout_millis=30000): |
| 115 | +        return True |
| 116 | +``` |
| 117 | + |
| 118 | +Register it before the exporter processor: |
| 119 | + |
| 120 | +```python |
| 121 | +provider = TracerProvider(resource=resource) |
| 122 | +provider.add_span_processor( |
| 123 | +    AgentIdentitySpanProcessor( |
| 124 | +        agent_id="support-agent", |
| 125 | +        agent_name="Support Agent", |
| 126 | +    ) |
| 127 | +) |
| 128 | +provider.add_span_processor(BatchSpanProcessor(exporter)) |
| 129 | +trace.set_tracer_provider(provider) |
| 130 | +``` |
| 131 | + |
| 132 | +This makes every span self-identifying with normal span attributes. That is the safest form for OpenObserve agent attribution, especially for evaluations and non-LLM helper spans that should still carry agent identity. |
| 133 | + |
| 134 | +## Option 3: Request-Scoped Identity With Baggage |
| 135 | + |
| 136 | +This is the native equivalent of `openobserve_agent(...)`. |
| 137 | + |
| 138 | +```python |
| 139 | +from contextlib import contextmanager |
| 140 | + |
| 141 | +from opentelemetry import baggage, context |
| 142 | + |
| 143 | +AGENT_ID_KEY = "gen_ai.agent.id" |
| 144 | +AGENT_NAME_KEY = "gen_ai.agent.name" |
| 145 | + |
| 146 | + |
| 147 | +@contextmanager |
| 148 | +def agent_identity(agent_id=None, agent_name=None): |
| 149 | +    agent_id = _clean(agent_id) |
| 150 | +    agent_name = _clean(agent_name) |
| 151 | +    if agent_id is None and agent_name is None: |
| 152 | +        raise ValueError("Agent identity requires at least one of agent_id or agent_name") |
| 153 | + |
| 154 | +    local_token = _current_agent_identity.set((agent_id, agent_name)) |
| 155 | +    ctx = context.get_current() |
| 156 | +    if agent_id is not None: |
| 157 | +        ctx = baggage.set_baggage(AGENT_ID_KEY, agent_id, context=ctx) |
| 158 | +    else: |
| 159 | +        ctx = baggage.remove_baggage(AGENT_ID_KEY, context=ctx) |
| 160 | + |
| 161 | +    if agent_name is not None: |
| 162 | +        ctx = baggage.set_baggage(AGENT_NAME_KEY, agent_name, context=ctx) |
| 163 | +    else: |
| 164 | +        ctx = baggage.remove_baggage(AGENT_NAME_KEY, context=ctx) |
| 165 | + |
| 166 | +    token = context.attach(ctx) |
| 167 | +    try: |
| 168 | +        yield |
| 169 | +    finally: |
| 170 | +        context.detach(token) |
| 171 | +        _current_agent_identity.reset(local_token) |
| 172 | +``` |
| 173 | + |
| 174 | +Use it around request or workflow execution: |
| 175 | + |
| 176 | +```python |
| 177 | +with agent_identity(agent_id="triage", agent_name="Triage Agent"): |
| 178 | +    run_agent_workflow() |
| 179 | +``` |
| 180 | + |
| 181 | +If outbound HTTP instrumentation is configured to propagate W3C baggage, downstream services can inherit the same identity. The downstream service can still override it with its own local or static identity. |
| 182 | + |
| 183 | +The `ContextVar` is not used for propagation. It exists to mirror the SDK's precedence rules inside the current process: local request identity wins, static process identity is next, inherited baggage is last. This also prevents partial local identities from mixing with inherited or static fields. |
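
The precedence rules can be isolated into a small pure function, which makes them easy to unit test without a tracer. This is a sketch of the logic described above, not the SDK's own code; the function name and the tuple layout are assumptions chosen for illustration.

```python
def resolve_agent_identity(local=None, static=(None, None), inherited=(None, None)):
    """Return (agent_id, agent_name) using local > static > inherited precedence.

    Each source is an (agent_id, agent_name) tuple. `local` is None when no
    request-scoped identity is active in this process.
    """
    if local is not None:
        # A local identity is used as a unit, even if one field is missing.
        return local
    if static[0] is not None or static[1] is not None:
        return static
    return inherited


# A partial local identity does not borrow the static name.
print(resolve_agent_identity(local=("triage", None), static=("support-agent", "Support Agent")))
# ('triage', None)
```

Treating each source as an all-or-nothing pair is what keeps an ID from one agent from being combined with a name from another.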
| 184 | + |
| 185 | +## Explicit Baggage Propagation |
| 186 | + |
| 187 | +If your application has a custom propagation setup, make sure baggage is included: |
| 188 | + |
| 189 | +```python |
| 190 | +from opentelemetry.propagate import set_global_textmap |
| 191 | +from opentelemetry.propagators.composite import CompositePropagator |
| 192 | +from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator |
| 193 | +from opentelemetry.baggage.propagation import W3CBaggagePropagator |
| 194 | + |
| 195 | +set_global_textmap( |
| 196 | +    CompositePropagator( |
| 197 | +        [ |
| 198 | +            TraceContextTextMapPropagator(), |
| 199 | +            W3CBaggagePropagator(), |
| 200 | +        ] |
| 201 | +    ) |
| 202 | +) |
| 203 | +``` |
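
When debugging propagation, it helps to know roughly what travels on the wire. With W3C baggage, the identity is carried in a `baggage` HTTP header as comma-separated `key=value` members. The sketch below approximates that format using only the standard library; the helper names are hypothetical, and the exact escaping (for example, how spaces are encoded) is decided by the propagator you actually use, so treat the output as illustrative.

```python
from urllib.parse import quote, unquote


def format_baggage_header(entries):
    """Serialize a dict into an approximate W3C `baggage` header value."""
    return ",".join(
        f"{quote(str(k), safe='')}={quote(str(v), safe='')}" for k, v in entries.items()
    )


def parse_baggage_header(header):
    """Parse a `baggage` header value back into a dict, ignoring properties."""
    entries = {}
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            # Drop any optional ";property" metadata that follows the value.
            entries[unquote(key)] = unquote(value.split(";", 1)[0])
    return entries


header = format_baggage_header(
    {"gen_ai.agent.id": "triage", "gen_ai.agent.name": "Triage Agent"}
)
print(header)
# gen_ai.agent.id=triage,gen_ai.agent.name=Triage%20Agent
```

If a downstream service never sees these keys, check that the baggage propagator is installed on both sides and that no proxy strips the header.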
| 204 | + |
| 205 | +## Full Native Trace Setup |
| 206 | + |
| 207 | +```python |
| 208 | +from opentelemetry import trace |
| 209 | +from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter |
| 210 | +from opentelemetry.sdk.resources import Resource |
| 211 | +from opentelemetry.sdk.trace import TracerProvider |
| 212 | +from opentelemetry.sdk.trace.export import BatchSpanProcessor |
| 213 | + |
| 214 | +resource = Resource.create( |
| 215 | +    { |
| 216 | +        "service.name": "o2-ai", |
| 217 | +        "service.version": "1.0.0", |
| 218 | +        "deployment.environment": "production", |
| 219 | +    } |
| 220 | +) |
| 221 | + |
| 222 | +exporter = OTLPSpanExporter( |
| 223 | +    endpoint="http://localhost:5080/api/default/v1/traces", |
| 224 | +    headers={ |
| 225 | +        "Authorization": "Basic <base64>", |
| 226 | +        "stream-name": "default", |
| 227 | +    }, |
| 228 | +    timeout=30, |
| 229 | +) |
| 230 | + |
| 231 | +provider = TracerProvider(resource=resource) |
| 232 | +provider.add_span_processor( |
| 233 | +    AgentIdentitySpanProcessor(agent_id="o2-ai", agent_name="O2 AI") |
| 234 | +) |
| 235 | +provider.add_span_processor(BatchSpanProcessor(exporter)) |
| 236 | +trace.set_tracer_provider(provider) |
| 237 | + |
| 238 | +tracer = trace.get_tracer("o2-ai") |
| 239 | + |
| 240 | +with agent_identity(agent_id="sre", agent_name="SRE Agent"): |
| 241 | +    with tracer.start_as_current_span("agent.workflow") as span: |
| 242 | +        span.set_attribute("gen_ai.operation.name", "chat") |
| 243 | +        run_agent_workflow() |
| 244 | +``` |
| 245 | + |
| 246 | +## HTTP And gRPC OpenObserve Endpoints |
| 247 | + |
| 248 | +For HTTP/protobuf traces: |
| 249 | + |
| 250 | +```python |
| 251 | +OTLPSpanExporter( |
| 252 | +    endpoint="https://<openobserve-host>/api/<org>/v1/traces", |
| 253 | +    headers={ |
| 254 | +        "Authorization": "Basic <base64>", |
| 255 | +        "stream-name": "default", |
| 256 | +    }, |
| 257 | +) |
| 258 | +``` |
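
Since endpoint and header construction is yours to own in a native setup, a small helper can keep it in one place. The sketch below assumes the `<base64>` placeholder is a standard HTTP Basic credential, that is, `user:password` base64-encoded; confirm the expected credential against your OpenObserve ingestion settings. The function name and parameters are hypothetical.

```python
import base64


def openobserve_http_trace_config(base_url, org, user, password, stream="default"):
    """Build endpoint and headers for the OTLP/HTTP trace exporter."""
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {
        "endpoint": f"{base_url.rstrip('/')}/api/{org}/v1/traces",
        "headers": {
            "Authorization": f"Basic {token}",
            "stream-name": stream,
        },
    }


config = openobserve_http_trace_config(
    "http://localhost:5080/", "default", "root@example.com", "secret"
)
print(config["endpoint"])
# http://localhost:5080/api/default/v1/traces
```

The returned dictionary uses the same `endpoint` and `headers` keyword names as the HTTP exporter shown above, so it can be unpacked as `OTLPSpanExporter(**config)`.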
| 259 | + |
| 260 | +For gRPC traces, use the gRPC exporter and pass OpenObserve headers as metadata: |
| 261 | + |
| 262 | +```python |
| 263 | +from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter |
| 264 | + |
| 265 | +OTLPSpanExporter( |
| 266 | +    endpoint="<openobserve-host>:5081", |
| 267 | +    headers=( |
| 268 | +        ("authorization", "Basic <base64>"), |
| 269 | +        ("organization", "default"), |
| 270 | +        ("stream-name", "default"), |
| 271 | +    ), |
| 272 | +    insecure=False, |
| 273 | +) |
| 274 | +``` |
| 275 | + |
| 276 | +Use lowercase header names for gRPC metadata; gRPC requires metadata keys to be lowercase. |
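
If you keep one header mapping for both transports, a small conversion step avoids casing mistakes on the gRPC side. This is a minimal sketch with a hypothetical helper name; it only lowercases keys and leaves values untouched.

```python
def to_grpc_metadata(headers):
    """Convert a header mapping into a tuple of (lowercase key, value) pairs."""
    return tuple((str(key).lower(), str(value)) for key, value in headers.items())


http_headers = {
    "Authorization": "Basic <base64>",
    "stream-name": "default",
}
# The gRPC example above also sends an `organization` entry.
grpc_headers = to_grpc_metadata({**http_headers, "organization": "default"})
print(grpc_headers)
```

The resulting tuple has the same shape as the `headers=` argument in the gRPC exporter example.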
| 277 | + |
| 278 | +## Choosing The Right Approach |
| 279 | + |
| 280 | +| Requirement | Native OTel approach | |
| 281 | +| --- | --- | |
| 282 | +| One static agent per process | Resource attributes can be enough | |
| 283 | +| OpenObserve agent attribution on LLM spans | Resource fallback or span processor | |
| 284 | +| Every span should be self-identifying | Span processor | |
| 285 | +| One process handles multiple agents | Request-scoped baggage plus span processor | |
| 286 | +| Identity should propagate to downstream services | Baggage propagation | |
| 287 | +| Span identity should override inherited identity | Local request-scoped baggage/context | |
| 288 | + |
| 289 | +## OpenObserve Behavior Notes |
| 290 | + |
| 291 | +- Span attributes named `gen_ai.agent.name` and `gen_ai.agent.id` are the primary agent identity fields. |
| 292 | +- Resource attributes are stored separately as resource/service metadata. OpenObserve prefixes non-`service.name` resource fields internally. |
| 293 | +- Current OpenObserve extraction can use resource agent fields as a fallback for LLM spans. |
| 294 | +- Span-level agent fields win over resource-level fields. |
| 295 | +- If a process can emit spans for multiple agents, prefer span attributes over resource attributes. |
| 296 | + |
| 297 | +## What Native OTel Does Not Give You Automatically |
| 298 | + |
| 299 | +Using native OpenTelemetry means you own: |
| 300 | + |
| 301 | +- OpenObserve endpoint construction. |
| 302 | +- OpenObserve authorization and stream headers. |
| 303 | +- Provider singleton lifecycle. |
| 304 | +- Flush and shutdown behavior. |
| 305 | +- Agent identity precedence rules. |
| 306 | +- Baggage propagation setup. |
| 307 | +- Logs and metrics exporter setup, if you need those signals. |
| 308 | + |
| 309 | +Use the OpenObserve SDK when you want those defaults managed for you. Use native OpenTelemetry when your application already has a provider/exporter lifecycle and you only need to add OpenObserve-compatible conventions. |