Commit 85fcce3

Update metric_types.md for native histograms
Note that I used this opportunity to replace the term "client library" with "instrumentation library". I always thought that "client library" is confusing as it is not implementing a client in any way. (Technically, it implements a _server_, of which the Prometheus "server" is the client… 🤯) Even if we accept that "Prometheus client library" just means "a library to do something that has to do with Prometheus", the title "client library" still doesn't tell us what the library is actually for. (Note that the client_golang repository not only contains an instrumentation library, but also includes an _actual_ client library that helps you to implement clients that talk to the Prometheus HTTP API.)

Signed-off-by: beorn7 <beorn@grafana.com>
1 parent 48f6cd7 commit 85fcce3

2 files changed

Lines changed: 82 additions & 36 deletions

docs/concepts/metric_types.md

Lines changed: 79 additions & 33 deletions
@@ -3,11 +3,17 @@ title: Metric types
 sort_rank: 2
 ---

-The Prometheus client libraries offer four core metric types. These are
-currently only differentiated in the client libraries (to enable APIs tailored
-to the usage of the specific types) and in the wire protocol. The Prometheus
-server does not yet make use of the type information and flattens all data into
-untyped time series. This may change in the future.
+The Prometheus instrumentation libraries offer four core metric types. With the
+exception of native histograms, these are currently only differentiated in the
+instrumentation libraries (to enable APIs tailored to the usage of the specific types)
+and in the exposition protocols. The Prometheus server does not yet make use of
+the type information and flattens all types except native histograms into
+untyped time series of floating point values. Native histograms, however, are
+ingested as time series of special composite histogram samples. In the future,
+Prometheus might handle other metric types as [composite
+types](/blog/2026/02/14/modernizing-prometheus-composite-samples/), too. There
+is also ongoing work to persist the type information of the current simple
+types.

 ## Counter

@@ -20,7 +26,7 @@ errors.
 Do not use a counter to expose a value that can decrease. For example, do not
 use a counter for the number of currently running processes; instead use a gauge.

-Client library usage documentation for counters:
+Instrumentation library usage documentation for counters:

 * [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Counter)
 * [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#counter)
@@ -38,7 +44,7 @@ Gauges are typically used for measured values like temperatures or current
 memory usage, but also "counts" that can go up and down, like the number of
 concurrent requests.

-Client library usage documentation for gauges:
+Instrumentation library usage documentation for gauges:

 * [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Gauge)
 * [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#gauge)
@@ -51,37 +57,77 @@ Client library usage documentation for gauges:

 A _histogram_ samples observations (usually things like request durations or
 response sizes) and counts them in configurable buckets. It also provides a sum
-of all observed values.
-
-A histogram with a base metric name of `<basename>` exposes multiple time series
-during a scrape:
-
-* cumulative counters for the observation buckets, exposed as `<basename>_bucket{le="<upper inclusive bound>"}`
+of all observed values. As such, a histogram is essentially a bucketed counter.
+However, a histogram can also represent the current state of a distribution, in
+which case it is called a _gauge histogram_. Gauge histograms are rarely
+directly exposed by instrumented programs and are thus not (yet) usable in
+instrumentation libraries, but they are represented in newer versions of the protobuf
+exposition format and in [OpenMetrics](https://openmetrics.io/). They are also
+created regularly by PromQL expressions. For example, the outcome of applying
+the `rate` function to a counter histogram is a gauge histogram, in the same
+way as the outcome of applying the `rate` function to a counter is a gauge.
+
+Histograms exist in two fundamentally different versions: The more recent
+_native histograms_ and the _classic histograms_.
+
+A native histogram is exposed and ingested as composite samples, where each
+sample represents the count and sum of observations together with a dynamic set
+of buckets.
+
+A classic histogram, however, consists of multiple time series of simple float
+samples. A classic histogram with a base metric name of `<basename>` results in
+the following time series:
+
+* cumulative counters for the observation buckets, exposed as
+`<basename>_bucket{le="<upper inclusive bound>"}`
 * the **total sum** of all observed values, exposed as `<basename>_sum`
-* the **count** of events that have been observed, exposed as `<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
-
-Use the
-[`histogram_quantile()` function](/docs/prometheus/latest/querying/functions/#histogram_quantile)
-to calculate quantiles from histograms or even aggregations of histograms. A
-histogram is also suitable to calculate an
-[Apdex score](http://en.wikipedia.org/wiki/Apdex). When operating on buckets,
-remember that the histogram is
-[cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram). See
-[histograms and summaries](/docs/practices/histograms) for details of histogram
-usage and differences to [summaries](#summary).
-
-NOTE: Beginning with Prometheus v2.40, there is experimental support for native
-histograms. A native histogram requires only one time series, which includes a
-dynamic number of buckets in addition to the sum and count of
-observations. Native histograms allow much higher resolution at a fraction of
-the cost. Detailed documentation will follow once native histograms are closer
-to becoming a stable feature.
+* the **count** of events that have been observed, exposed as
+`<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
+
+Native histograms are generally much more efficient than classic histograms,
+allow much higher resolution, and do not require explicit configuration of
+bucket boundaries during instrumentation. Their bucketing schema ensures that
+they are always aggregatable with each other, even if the resolution might have
+changed, while classic histograms with different bucket boundaries are not
+generally aggregatable. If the instrumentation library you are using supports native
+histograms (currently this is the case for Go and Java), you should probably
+prefer native histograms over classic histograms.
+
+If you are stuck with classic histograms for whatever reason, there is a way to
+get at least some of the benefits of native histograms: You can configure
+Prometheus to ingest classic histograms into a special form of native
+histograms, called Native Histograms with Custom Bucket boundaries (NHCB).
+NHCBs are stored as the same composite samples as usual native histograms with
+the same gain in efficiency. However, their buckets are still the same buckets
+statically configured during instrumentation, with their limited resolution and
+range and the same problems of aggregatability upon changing the bucket
+boundaries.
+
+Use the [`histogram_quantile()`
+function](/docs/prometheus/latest/querying/functions/#histogram_quantile) to
+calculate quantiles from histograms or even aggregations of histograms. It
+works for both classic and native histograms, using a slightly different
+syntax. Histograms are also suitable to calculate an [Apdex
+score](http://en.wikipedia.org/wiki/Apdex).
+
+You can operate directly on the buckets of a classic histogram, as they are
+represented as individual series (called `<basename>_bucket{le="<upper
+inclusive bound>"}` as described above). Remember, however, that these buckets
+are [cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram),
+i.e. every bucket counts all observations less than or equal to the upper
+boundary provided as a label. With native histograms, use the
+[`histogram_fraction()`
+function](/docs/prometheus/latest/querying/functions/#histogram_fraction) to
+calculate fractions of observations within given boundaries.
+
+See [histograms and summaries](/docs/practices/histograms) for details of
+histogram usage and differences to [summaries](#summary).

 NOTE: Beginning with Prometheus v3.0, the values of the `le` label of classic
 histograms are normalized during ingestion to follow the format of
 [OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).

-Client library usage documentation for histograms:
+Instrumentation library usage documentation for histograms:

 * [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Histogram)
 * [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#histogram)
@@ -111,7 +157,7 @@ to [histograms](#histogram).
 NOTE: Beginning with Prometheus v3.0, the values of the `quantile` label are normalized during
 ingestion to follow the format of [OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).

-Client library usage documentation for summaries:
+Instrumentation library usage documentation for summaries:

 * [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Summary)
 * [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#summary)
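
The series layout that the metric_types.md text gives for a classic histogram (cumulative `_bucket` series with inclusive upper bounds, plus `_sum` and `_count`) can be sketched in plain Python. This is an illustration only and is not taken from the changed files: the function name `classic_histogram`, the metric name `request_seconds`, and the bucket bounds are hypothetical, and no Prometheus instrumentation library is involved.

```python
import math
from bisect import bisect_left


def classic_histogram(basename, upper_bounds, observations):
    """Return the float series of a classic histogram as {series_name: value}.

    Hypothetical helper for illustration; not a Prometheus API.
    """
    # Every classic histogram has an implicit +Inf bucket at the end.
    bounds = sorted(upper_bounds) + [math.inf]
    per_bucket = [0] * len(bounds)
    for value in observations:
        # bisect_left finds the first bound >= value, i.e. the bucket
        # whose inclusive upper bound ("le") covers the observation.
        per_bucket[bisect_left(bounds, value)] += 1

    series = {}
    running = 0
    for bound, n in zip(bounds, per_bucket):
        running += n  # buckets are cumulative
        le = "+Inf" if math.isinf(bound) else repr(float(bound))
        series[f'{basename}_bucket{{le="{le}"}}'] = running
    series[f"{basename}_sum"] = sum(observations)
    series[f"{basename}_count"] = len(observations)
    return series


if __name__ == "__main__":
    exposed = classic_histogram(
        "request_seconds", [0.1, 0.5, 1.0], [0.05, 0.1, 0.3, 0.7, 2.0]
    )
    for name, value in exposed.items():
        print(name, value)
```

Because the buckets are cumulative, the `le="+Inf"` bucket always equals `<basename>_count`, which is the identity the bullet list in the diff points out.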

docs/practices/histograms.md

Lines changed: 3 additions & 3 deletions
@@ -397,9 +397,9 @@ Classic histogram version:
 histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) // GOOD.

 Furthermore, should your SLO change and you now want to plot the 90th
-percentile, or you want to take into account the last 10 minutes
-instead of the last 5 minutes, you only have to adjust the expressions
-above and you do not need to reconfigure the clients.
+percentile, or you want to take into account the last 10 minutes instead of the
+last 5 minutes, you only have to adjust the expressions above and you do not
+need to reconfigure the instrumentation of the monitored programs.

 ### Errors of quantile estimation
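
The point of the histograms.md hunk, that a different percentile only requires a different query over the same buckets, can be sketched in plain Python. This is a simplified linear interpolation in the spirit of `histogram_quantile()` for classic histograms, not the Prometheus implementation: the function name `estimate_quantile` and the sample bucket counts are hypothetical, the lowest bucket is assumed to start at 0, and a quantile that lands in the `+Inf` bucket falls back to the highest finite bound.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative classic-histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound
    and ending with (math.inf, total_count). Hypothetical helper for
    illustration; not a Prometheus API.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, nothing to estimate

    rank = q * total
    lower_bound, lower_count = 0.0, 0  # assumption: lowest bucket starts at 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The +Inf bucket has no finite width to interpolate over.
                return lower_bound
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - lower_count) / (cumulative - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return math.nan


if __name__ == "__main__":
    # The same bucket data serves both percentiles; only q changes.
    data = [(0.1, 50), (0.25, 80), (0.5, 95), (1.0, 99), (math.inf, 100)]
    print("p95 estimate:", estimate_quantile(0.95, data))
    print("p90 estimate:", estimate_quantile(0.90, data))
```

The estimate can only be as precise as the bucket layout allows, since every observation inside a bucket is treated as evenly spread between its bounds.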
