Prometheus (software)

Free application for event monitoring and alerting

From Wikipedia, the free encyclopedia

Prometheus is a free software application for event monitoring and alerting.[2] It collects metrics over an HTTP pull model and records them in a time series database that supports high dimensionality through key-value label pairs, along with flexible queries and real-time alerting.[3] The project is written in Go and licensed under the Apache License 2.0, with source code available on GitHub.[4]

Initial release: November 24, 2012
Stable release: v3.2.1[1] / February 26, 2025
Written in: Go
Operating system: Cross-platform
Type: Time series database
License: Apache License 2.0
Website: prometheus.io
Repository: github.com/prometheus/prometheus

Prometheus originated at SoundCloud in 2012 and was accepted by the Cloud Native Computing Foundation (CNCF) in 2016, graduating from incubation in 2018. It is commonly paired with Grafana for dashboard visualization and supports a wide range of exporters and integrations.

History

Prometheus was developed at SoundCloud starting in 2012,[5] after the company found that its existing metrics tools, based on StatsD and Graphite, could not meet the demands of its containerized infrastructure. The design goals included a multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language in a single tool.[6] The project was open source from the start and was adopted by Boxever and Docker users before any official announcement.[6][7]

The design was influenced by Borgmon, Google's internal time-series monitoring system, which treated time-series data as a source for alert generation.[8][9]

By 2013, Prometheus was in production use at SoundCloud. The project was publicly announced in January 2015.[6]

In May 2016, the Cloud Native Computing Foundation accepted Prometheus as its second incubated project, after Kubernetes.[10] In August 2018, the CNCF announced that Prometheus had graduated from incubation.[11]

Versions

Prometheus 1.0 was released in July 2016.[12] Subsequent releases through 2016 and 2017 led to Prometheus 2.0 in November 2017, which introduced a new storage engine with significantly improved performance and reduced disk usage.[13]

Architecture

A typical Prometheus monitoring deployment consists of several components working together.[5] Exporters run on monitored hosts to collect and expose local metrics. The Prometheus server scrapes those exporters at a configured interval, aggregates the data, and stores it locally. Alertmanager[14] receives alerts from Prometheus and handles routing, grouping, and silencing before forwarding notifications. Grafana is commonly used to build dashboards from Prometheus data. Both dashboard queries and alert conditions are written in PromQL, Prometheus's native query language.

Data model

Prometheus data is organized as named metrics, each optionally qualified by an arbitrary number of key-value label pairs. Labels can identify the data source (server name, datacenter) or carry application-specific context such as HTTP status code, request method, or endpoint. Querying in real time against any combination of labels is what makes the data model multi-dimensional.[15][6][7]
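The label-based model can be illustrated with a minimal Python sketch. This is not how Prometheus represents series internally; the Series class, the make_series and select helpers, and the metric and label names are all hypothetical, chosen only to show how a name plus label pairs identifies a series and how any subset of labels can act as a filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """One time series: a metric name plus a set of label pairs."""
    name: str
    labels: frozenset  # frozenset of (label_name, label_value) tuples


def make_series(name, **labels):
    return Series(name, frozenset(labels.items()))


def select(series_list, name, **matchers):
    """Return the series with this name whose labels include every matcher."""
    wanted = set(matchers.items())
    return [s for s in series_list if s.name == name and wanted <= s.labels]


# Hypothetical metric and label names, for illustration only.
all_series = [
    make_series("http_requests_total", method="GET", status="200", instance="web-1"),
    make_series("http_requests_total", method="GET", status="500", instance="web-1"),
    make_series("http_requests_total", method="POST", status="200", instance="web-2"),
]

# Any combination of labels can serve as a filter.
get_requests = select(all_series, "http_requests_total", method="GET")
ok_on_web2 = select(all_series, "http_requests_total", status="200", instance="web-2")
```

Here get_requests contains the two GET series and ok_on_web2 contains the single POST series on web-2; each label adds one dimension along which the data can be sliced.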

Prometheus stores data locally on disk for fast writes and queries.[6] Metrics can also be forwarded to remote storage backends, including Grafana Mimir and other Prometheus-compatible systems.[16]

Data collection

Prometheus collects data through a pull model: the server periodically queries a configured list of targets (exporters) and aggregates the returned time-series values.[6] Prometheus includes several service discovery mechanisms to automatically locate targets in dynamic environments.[17]
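A single scrape cycle might be sketched as follows. The parser handles only a simplified subset of the text exposition format (no escaping, timestamps, or special values), and the fetch callable stands in for an HTTP GET of each target's metrics endpoint. The function names, target address, and sample data are assumptions made for illustration.

```python
import re

# Matches: metric_name{label="value",...} 123.0   (the label block is optional)
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Parse a simplified subset of the text exposition format into samples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip lines this sketch cannot parse
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(targets, fetch):
    """Pull once from every target; `fetch` maps a target address to response text."""
    collected = []
    for target in targets:
        for name, labels, value in parse_exposition(fetch(target)):
            labels = {**labels, "instance": target}  # record which target it came from
            collected.append((name, labels, value))
    return collected


# Stand-in for real HTTP responses (hypothetical data).
FAKE_RESPONSES = {
    "web-1:8080": '# TYPE http_requests_total counter\n'
                  'http_requests_total{method="GET"} 1027\n'
                  'process_open_fds 12\n',
}

samples = scrape(["web-1:8080"], FAKE_RESPONSES.__getitem__)
```

Because the server initiates every request, the list of targets is the only thing that has to change when hosts come and go, which is the job service discovery takes over in dynamic environments.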

PromQL

Prometheus provides its own query language, PromQL (Prometheus Query Language), which allows users to select and aggregate time-series data. The language includes time-oriented constructs such as the rate() function, instant vectors, and range vectors that return multiple samples per series over a specified time window.[18]
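The difference between the two vector kinds, and the idea behind rate(), can be sketched in Python. An instant vector carries one sample per series, while a range vector carries a list of (timestamp, value) samples per series. The simple_rate function below is a hypothetical simplification: it handles counter resets but, unlike the real rate(), does not extrapolate to the edges of the window, so its results will not match Prometheus exactly.

```python
def simple_rate(samples):
    """Per-second increase of a counter over a window of (timestamp, value) samples.

    A value lower than its predecessor is treated as a counter reset, meaning
    the counter restarted from zero. No extrapolation to the window boundaries
    is performed, unlike Prometheus's rate().
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# One series of a range vector: samples 15 s apart, with a reset before t=45.
window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 10.0), (60, 40.0)]
per_second = simple_rate(window)
```

In this window the counter grows by 30, 30, 10 (after the reset), and 30, so the total increase of 100 over 60 seconds gives roughly 1.67 per second.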

Prometheus defines four metric types that PromQL operates on:[19] Counter (a monotonically increasing value), Gauge (an arbitrary value that can go up or down), Histogram (samples observations and counts them in configurable buckets), and Summary (similar to Histogram but calculates quantiles on the client side).
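As a rough illustration of the Histogram type, the sketch below counts observations into buckets and reports them cumulatively, so each bucket gives the number of observations less than or equal to its upper bound. The class, its method names, and the bucket bounds are hypothetical and do not reproduce any official client library.

```python
import bisect


class Histogram:
    """Counts observations in buckets and tracks a running sum and count."""

    def __init__(self, upper_bounds):
        # A final +Inf bucket catches every observation above the last bound.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.sum += value
        self.count += 1

    def cumulative(self):
        """Return (upper_bound, observations <= upper_bound) pairs."""
        running, result = 0, []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


h = Histogram([0.1, 0.5, 1.0])  # hypothetical bucket bounds, in seconds
for seconds in (0.05, 0.3, 0.5, 0.7, 2.0):
    h.observe(seconds)
```

Because only bucket counters are exported, quantiles can be estimated later on the server side, whereas a Summary fixes its quantiles in the client at observation time.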

Example

# A metric with label filtering
go_gc_duration_seconds{instance="localhost:9090", job="alertmanager"}

# Aggregation operators
sum by (app, proc) (
  instance_memory_limit_bytes - instance_memory_usage_bytes
) / 1024 / 1024

[20]
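What the aggregation in the second query does can be approximated in Python. The sketch assumes the subtraction has already produced one value per series; the sum_by helper, the label values, and the byte counts are hypothetical and serve only to show how grouping by app and proc discards every other label.

```python
from collections import defaultdict


def sum_by(instant_vector, group_labels):
    """Sum sample values, grouping by the given label names and dropping the rest."""
    totals = defaultdict(float)
    for labels, value in instant_vector:
        # A missing label is treated as an empty value, as in PromQL.
        key = tuple((name, labels.get(name, "")) for name in group_labels)
        totals[key] += value
    return dict(totals)


MIB = 1024 * 1024

# Hypothetical per-instance results of (limit - usage), in bytes.
unused_bytes = [
    ({"app": "api", "proc": "web", "instance": "a"}, 512 * MIB),
    ({"app": "api", "proc": "web", "instance": "b"}, 256 * MIB),
    ({"app": "api", "proc": "worker", "instance": "c"}, 128 * MIB),
]

unused_mib = {
    key: total / 1024 / 1024
    for key, total in sum_by(unused_bytes, ["app", "proc"]).items()
}
```

The two web instances collapse into a single group of 768 MiB, while the worker keeps its own group of 128 MiB; the instance label no longer appears in the result.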

Alerting

Alert rules in Prometheus specify a condition and a duration; if the condition holds for that duration, Prometheus fires an alert to Alertmanager. Alertmanager handles silencing, inhibition, and routing to notification destinations including email, Slack, and PagerDuty.[21] Additional targets such as Microsoft Teams[22] can be reached through the Alertmanager webhook receiver interface.[23]
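The condition-plus-duration behavior can be sketched as a small state machine that moves an alert from inactive to pending to firing. The AlertRule class, its threshold, and the evaluation times are hypothetical; the sketch models only the timing logic, not rule evaluation against stored data or delivery to Alertmanager.

```python
class AlertRule:
    """Tracks one alert through the inactive, pending, and firing states."""

    def __init__(self, condition, for_seconds):
        self.condition = condition      # callable: current value -> bool
        self.for_seconds = for_seconds  # how long the condition must hold
        self.active_since = None        # time the condition first became true
        self.state = "inactive"

    def evaluate(self, now, value):
        if not self.condition(value):
            # Condition no longer holds: reset the timer.
            self.active_since = None
            self.state = "inactive"
        else:
            if self.active_since is None:
                self.active_since = now
            held = now - self.active_since
            self.state = "firing" if held >= self.for_seconds else "pending"
        return self.state


# Hypothetical rule: value above 0.9 for at least 120 seconds.
rule = AlertRule(lambda v: v > 0.9, for_seconds=120)
states = [
    rule.evaluate(t, v)
    for t, v in [(0, 0.95), (60, 0.97), (120, 0.99), (180, 0.5)]
]
```

The alert stays pending for the first two evaluations, fires once the condition has held for 120 seconds, and returns to inactive as soon as the value drops, which is how the duration filters out short spikes.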

Time series database

Prometheus includes its own time series database. Recent data (by default, one to three hours) is held in a combination of memory[24] and mmap-backed files.[25] Older data is written to persistent blocks indexed with an inverted index, which suits Prometheus's label-based query patterns.[26][27] A background compaction process merges smaller blocks into larger ones to reduce read overhead.[28] Durability against crashes is provided by a write-ahead log (WAL).[29]
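Why an inverted index suits label-based queries can be seen in a minimal sketch: each label pair maps to the set of series IDs that carry it, and a query with several matchers becomes a set intersection. The InvertedIndex class and the series below are hypothetical, and the sketch uses in-memory Python sets rather than the on-disk structures of the real storage engine. The metric name is stored under the reserved label __name__, as Prometheus does.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each (label name, label value) pair to the IDs of series carrying it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, series_id, labels):
        for pair in labels.items():
            self.postings[pair].add(series_id)

    def lookup(self, matchers):
        """Return the IDs of series that carry every requested label pair."""
        lists = [self.postings.get(pair, set()) for pair in matchers.items()]
        # In this sketch an empty matcher set selects nothing.
        return set.intersection(*lists) if lists else set()


index = InvertedIndex()
index.add(1, {"__name__": "http_requests_total", "method": "GET", "job": "api"})
index.add(2, {"__name__": "http_requests_total", "method": "POST", "job": "api"})
index.add(3, {"__name__": "process_open_fds", "job": "api"})

api_requests = index.lookup({"__name__": "http_requests_total", "job": "api"})
get_only = index.lookup({"method": "GET"})
```

Here api_requests resolves to series 1 and 2 and get_only to series 1, without scanning series that carry none of the requested pairs.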

Dashboards

Prometheus includes a basic expression browser but is not a full dashboard system. Grafana is the standard pairing, querying Prometheus via PromQL to produce dashboards; the need to deploy and maintain Grafana separately is sometimes cited as an operational drawback.[30]

Interoperability

Prometheus favors white-box monitoring, where applications publish internal metrics for collection. Exporters and agents are available for many applications and systems.[31] To ease the transition from existing monitoring stacks, data from several other monitoring protocols and systems can be brought into Prometheus: Graphite, StatsD, SNMP, JMX, and CollectD.[32]

Metrics are typically retained locally for a few weeks; the default retention period is 15 days. For longer retention, Prometheus can stream data to remote storage backends.[16]

OpenMetrics

An effort to standardize the Prometheus exposition format as OpenMetrics has gained adoption from several vendors, including InfluxData's TICK suite,[33] InfluxDB, Google Cloud Platform,[34] Datadog,[35] and New Relic.[36][37] The OpenMetrics specification is maintained separately from the Prometheus project.[38]

Library support

Prometheus client libraries are available for most major programming languages. The POCO C++ Libraries expose Prometheus metrics through the Poco::Prometheus namespace.[39]
