Monitoring lets you observe and track the performance, availability, and health of the system. We use Prometheus, an open-source monitoring and alerting toolkit.

Note that monitoring is currently not supported on Windows.

How It Works

To collect telemetry data, Prometheus scrapes an endpoint called /metrics, through which the service exposes its metrics. The scraped data is saved to a time series database (TSDB) and can later be analyzed, exported, or visualized with Prometheus components or third-party tools. Additionally, Alertmanager can be set up to trigger notifications.
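
To make the scraping step more concrete, the sketch below parses a small sample of the Prometheus text exposition format, which is what a /metrics endpoint returns. The sample values and the helper name parse_metrics are hypothetical, and the parser handles only simple lines without timestamps; it is an illustration, not a replacement for an official client library.

```python
import re

# Hypothetical sample of what a /metrics endpoint might return.
SAMPLE = """\
# HELP flask_http_request_total Total number of HTTP requests
# TYPE flask_http_request_total counter
flask_http_request_total{method="POST",status="200"} 42.0
flask_http_request_total{method="GET",status="404"} 3.0
gunicorn_queue_time_count 45.0
gunicorn_queue_time_sum 1.35
"""

# metric_name, optional {labels}, then a single numeric value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair: name="value" (value may contain escaped characters).
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # Unsupported line shape (for example, with a timestamp).
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```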

We created a simple Grafana dashboard as an example to visualize Face SDK Web API metrics. You can download and import this dashboard into your Grafana instance. For detailed instructions on installing Grafana and managing dashboards, refer to the official Grafana documentation.


For more information about the toolkit ecosystem and architecture, see Prometheus overview.

Quick Start

To start collecting the system telemetry, follow the steps below.

1. Install the Prometheus server and edit its configuration file, prometheus.yml, to scrape metrics from the target URL where your Face SDK Web Service instance is deployed. For details, see First steps with Prometheus.

2. Enable collecting metrics:

      enabled: true
      path: "metrics"

Parameter   Type      Default                             Description
enabled     boolean   true (for all OS except Windows)    Whether to collect Prometheus metrics.
path        string    metrics                             Specifies the custom file path to save metrics.
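
For step 1, a scrape configuration could look like the text rendered by the following sketch. The job name, scrape interval, and target address are assumptions chosen for illustration; only the /metrics path comes from this page. The keys used (global, scrape_interval, scrape_configs, job_name, metrics_path, static_configs, targets) are standard Prometheus configuration fields.

```python
def render_scrape_config(target,
                         metrics_path="/metrics",
                         job_name="face-sdk-web-service",  # hypothetical job name
                         scrape_interval="15s"):           # assumed interval
    """Render a minimal prometheus.yml that scrapes a single target."""
    return (
        "global:\n"
        f"  scrape_interval: {scrape_interval}\n"
        "\n"
        "scrape_configs:\n"
        f'  - job_name: "{job_name}"\n'
        f'    metrics_path: "{metrics_path}"\n'
        "    static_configs:\n"
        f'      - targets: ["{target}"]\n'
    )


if __name__ == "__main__":
    # "localhost:8080" is a placeholder; use the host and port of your deployment.
    print(render_scrape_config("localhost:8080"))
```

Save the printed text as prometheus.yml, or copy the scrape_configs entry into an existing configuration file.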

From this point on, Prometheus automatically writes the scraped metrics to its TSDB. Within the Prometheus ecosystem, you can retrieve data with the Prometheus Query Language (PromQL), visualize it in the expression browser or in Grafana, and configure notifications by adding Alertmanager.

Available Metrics

Face SDK Web Service, together with Prometheus, provides numerous metrics for request-response analysis. See the overview and useful references below.

Gunicorn metrics

Gunicorn provides the following metrics for monitoring and analyzing HTTP requests:

Summary metrics

  • gunicorn_queue_time: Measures the time a request spends in the queue (latency).
  • Sub-metrics:
    • gunicorn_queue_time_count: Total count of requests in the queue.
    • gunicorn_queue_time_sum: Sum of the time spent by requests in the queue.
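
The two sub-metrics above are usually combined: dividing the increase in the sum by the increase in the count over the same window gives the average queue time in that window. The sketch below shows the arithmetic on two hypothetical scrapes; the function name and sample numbers are illustrative, and the result is in whatever unit the metric reports.

```python
def average_queue_time(sum_start, count_start, sum_end, count_end):
    """Average time per request between two scrapes of a summary metric.

    Returns None when no new requests were observed in the window
    (or when the counters were reset, e.g. after a restart).
    """
    delta_count = count_end - count_start
    if delta_count <= 0:
        return None
    return (sum_end - sum_start) / delta_count


if __name__ == "__main__":
    # Hypothetical values of gunicorn_queue_time_sum / _count at two scrapes.
    print(average_queue_time(sum_start=1.0, count_start=10,
                             sum_end=2.5, count_end=40))
```

In PromQL, the same idea is commonly written as rate(gunicorn_queue_time_sum[5m]) / rate(gunicorn_queue_time_count[5m]), where the 5-minute window is an arbitrary choice.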

Histogram metrics

  • gunicorn_request_time_histogram: Represents the time a request spends in the queue as a histogram.
  • Sub-metrics:
    • gunicorn_request_time_histogram_count: Total count of requests in the histogram.
    • gunicorn_request_time_histogram_bucket: Distribution of requests in different time intervals.
    • gunicorn_request_time_histogram_sum: Sum of the time spent by requests in the histogram.
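
Histogram buckets are cumulative, so a quantile can be estimated by finding the bucket that contains the target rank and interpolating linearly inside it. The sketch below follows that approach, similar in spirit to the PromQL histogram_quantile function; the bucket boundaries and counts are hypothetical, and the lower bound of the first bucket is assumed to be 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count) sorted by upper_bound,
    with the last upper_bound being math.inf (the "+Inf" bucket).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return bound  # Empty bucket; avoid dividing by zero.
            # Linear interpolation inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Hypothetical bucket values taken from the *_bucket series.
    sample = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
    print(estimate_quantile(0.5, sample))
```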

HTTP request analysis metrics

For analyzing HTTP requests, the following metrics are available:

Counter metric

  • flask_http_request_total: Tracks the total count of requests.

Histogram metrics

  • flask_http_request_duration_seconds_bucket: Represents the duration of request execution in different time intervals.
  • flask_http_request_duration_seconds_count: Total count of requests in the histogram.
  • flask_http_request_duration_seconds_sum: Sum of the durations of requests in the histogram.

You can filter the metrics using the following labels:

  • method: Filters requests by HTTP method (e.g., GET, POST).
  • path: Filters requests by the path (/api/match, /api/match_and_search, /api/detect, /api/groups, /api/search, /api/v2/liveness).
  • status: Filters requests by the response code (e.g., 200, 404).
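
As an illustration of filtering, the sketch below builds a PromQL selector from these filters and turns it into a request URL for the Prometheus HTTP API (the /api/v1/query endpoint). The helper names, the Prometheus address, and the 5-minute rate window are assumptions; adjust them to your setup.

```python
from urllib.parse import urlencode


def build_selector(metric, **labels):
    """Build a PromQL selector such as metric{method="POST",status="200"}."""
    if not labels:
        return metric
    parts = []
    for name, value in sorted(labels.items()):
        # Escape backslashes and double quotes inside label values.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{name}="{escaped}"')
    return metric + "{" + ",".join(parts) + "}"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


if __name__ == "__main__":
    selector = build_selector("flask_http_request_total",
                              method="POST", path="/api/match", status="200")
    # Per-second rate of matching requests over the last 5 minutes.
    promql = f"sum(rate({selector}[5m]))"
    # "http://localhost:9090" is an assumed address of the Prometheus server.
    print(build_query_url("http://localhost:9090", promql))
```

The same selector can be pasted directly into the Prometheus expression browser or a Grafana panel.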