Introduction
Prometheus is a monitoring solution for storing time series data such as metrics. Grafana visualizes the data stored in Prometheus (and other sources). This sample demonstrates how to capture NServiceBus metrics, store them in Prometheus, and visualize them using Grafana.
This sample reports the following metrics to Prometheus:
- Fetched messages per second
- Failed messages per second
- Successful messages per second
- Critical time in seconds
- Processing time in seconds
For a detailed explanation of these metrics refer to the metrics captured section in the metrics documentation.
Prerequisites
To run this sample, download and run both Prometheus and Grafana.
Code overview
The sample simulates message load with a random 10% failure rate using the LoadSimulator class:
var simulator = new LoadSimulator(endpointInstance, TimeSpan.Zero, TimeSpan.FromSeconds(10));
simulator.Start();
Capturing metric values
A Prometheus service is hosted inside an endpoint via the NuGet package prometheus-net. The service enables Prometheus to scrape data gathered by the metrics package. In the sample, the service that exposes the data to scrape is hosted on http:/. The service is started and stopped inside a feature startup task, as shown below:
class MetricServerTask : FeatureStartupTask
{
    // Exposes the collected metrics over HTTP on port 3030 for Prometheus to scrape.
    MetricServer metricServer = new MetricServer(port: 3030);

    // Start serving metrics when the endpoint starts.
    protected override Task OnStart(IMessageSession session, CancellationToken cancellationToken = default)
    {
        metricServer.Start();
        return Task.CompletedTask;
    }

    // Stop serving metrics when the endpoint stops.
    protected override Task OnStop(IMessageSession session, CancellationToken cancellationToken = default)
    {
        metricServer.Stop();
        return Task.CompletedTask;
    }
}
Custom observers need to be registered for the metric probes provided via NServiceBus. This is all set up in the PrometheusFeature:
endpointConfiguration.EnableMetrics();
The names provided by the NServiceBus. probes are not compatible with Prometheus. The NServiceBus. names need to be aligned with the naming conventions defined by Prometheus by mapping them accordingly:
- Counters: nservicebus_{counter-name}_total
- Summaries: nservicebus_{summary-name}_seconds
Dictionary<string, string> nameMapping = new Dictionary<string, string>
{
    // https://prometheus.io/docs/practices/naming/
    {"# of msgs successfully processed / sec", "nservicebus_success_total"},
    {"# of msgs pulled from the input queue /sec", "nservicebus_fetched_total"},
    {"# of msgs failures / sec", "nservicebus_failure_total"},
    {"Critical Time", "nservicebus_criticaltime_seconds"},
    {"Processing Time", "nservicebus_processingtime_seconds"},
    {"Retries", "nservicebus_retries_total"},
};
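To see why the mapping is needed, the probe names can be checked against the metric name pattern documented by Prometheus, [a-zA-Z_:][a-zA-Z0-9_:]*. The following is a minimal sketch only: the PrometheusNames class and its members are hypothetical helpers introduced for illustration and are not part of the sample or of prometheus-net.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the sample.
public static class PrometheusNames
{
    // Metric name pattern as documented by Prometheus.
    static readonly Regex ValidName = new Regex("^[a-zA-Z_:][a-zA-Z0-9_:]*$");

    // True when the name only uses characters Prometheus accepts in metric names.
    public static bool IsValid(string name) => ValidName.IsMatch(name);

    // Builds a counter name following the convention used in this sample.
    public static string Counter(string shortName) => $"nservicebus_{shortName}_total";

    // Builds a summary name following the convention used in this sample.
    public static string Summary(string shortName) => $"nservicebus_{shortName}_seconds";
}
```

With this helper, PrometheusNames.IsValid("# of msgs failures / sec") evaluates to false because of the leading '#' and the spaces, while PrometheusNames.IsValid("nservicebus_failure_total") evaluates to true.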
The registered observers convert NServiceBus. Signals to Prometheus Counters and NServiceBus. Durations to Prometheus Summaries. Additionally, labels are added that identify the endpoint, the endpoint queue, and more within Prometheus. With these labels, it is possible to filter and group metric values.
var instanceQueueAddress = context.InstanceSpecificQueueAddress();
var labelValues = new[]
{
    settings.EndpointName(),
    Environment.MachineName,
    Dns.GetHostName(),
    context.LocalQueueAddress().ToString(),
    instanceQueueAddress != null ? instanceQueueAddress.Discriminator : null,
};
var metricsOptions = settings.Get<MetricsOptions>();
metricsOptions.RegisterObservers(
    register: probeContext =>
    {
        RegisterProbes(probeContext, labelValues);
    });
During the registration the following steps are required:
- Map metric names
- Register observer callbacks
- Create summaries and counters with corresponding labels
- Invoke the summaries and counters in the observer callback
foreach (var duration in context.Durations)
{
    if (!nameMapping.ContainsKey(duration.Name))
    {
        log.WarnFormat("Unsupported duration probe {0}", duration.Name);
        continue;
    }
    var prometheusName = nameMapping[duration.Name];
    var summary = Metrics.CreateSummary(prometheusName, duration.Description,
        new SummaryConfiguration
        {
            Objectives = new[]
            {
                new QuantileEpsilonPair(0.5, 0.05),
                new QuantileEpsilonPair(0.9, 0.01),
                new QuantileEpsilonPair(0.99, 0.001)
            },
            LabelNames = Labels
        });
    duration.Register((ref DurationEvent @event) => summary.Labels(labelValues).Observe(@event.Duration.TotalSeconds));
}
foreach (var signal in context.Signals)
{
    if (!nameMapping.ContainsKey(signal.Name))
    {
        log.WarnFormat("Unsupported signal probe {0}", signal.Name);
        continue;
    }
    var prometheusName = nameMapping[signal.Name];
    var counter = Metrics.CreateCounter(prometheusName, signal.Description, Labels);
    signal.Register((ref SignalEvent @event) => counter.Labels(labelValues).Inc());
}
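Each QuantileEpsilonPair in the summary above pairs a quantile with an allowed error. Assuming prometheus-net follows the same semantics as other Prometheus client libraries, the second value is an absolute error in the quantile rank, not in the measured duration. The sketch below computes the resulting rank range; SummaryObjectives and RankBounds are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of the sample or of prometheus-net.
public static class SummaryObjectives
{
    // Returns the range of quantile ranks that an estimate for the given
    // quantile may correspond to, clamped to the valid range [0, 1].
    public static (double Lower, double Upper) RankBounds(double quantile, double epsilon)
    {
        if (quantile < 0 || quantile > 1)
        {
            throw new ArgumentOutOfRangeException(nameof(quantile));
        }
        if (epsilon < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(epsilon));
        }
        return (Math.Max(0.0, quantile - epsilon), Math.Min(1.0, quantile + epsilon));
    }
}
```

Under that assumption, the 0.99 objective with an epsilon of 0.001 means the reported value is expected to fall between the 0.989 and 0.991 quantiles of the observed durations, so tighter epsilons are chosen for the higher quantiles.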
Prometheus
Prometheus needs to be configured to pull data from the endpoint. For more information on how to set up Prometheus, refer to the getting started guide.
Guided configuration
Copy the following files into the root folder of the Prometheus installation. Overwrite the existing prometheus. in the Prometheus demo installation, or proceed with the manual configuration if desired.
Manual configuration
Add a target
Edit prometheus. and add a new target for scraping, similar to:
- job_name: 'nservicebus'
  scrape_interval: 5s
  static_configs:
    - targets: ['localhost:3030']
Define rules
Queries can be expensive operations. Prometheus allows defining pre-calculated queries by configuring rules that calculate rates based on the counters.
groups:
  - name: NServiceBus
    rules:
      - record: nservicebus_success_total:avg_rate5m
        expr: avg(rate(nservicebus_success_total[5m]))
      - record: nservicebus_failure_total:avg_rate5m
        expr: avg(rate(nservicebus_failure_total[5m]))
      - record: nservicebus_fetched_total:avg_rate5m
        expr: avg(rate(nservicebus_fetched_total[5m]))
The pre-calculated query can then be used by referring to its recorded name:
nservicebus_success_total:avg_rate5m
For efficiency reasons, the sample dashboard shown later requires three queries defined in a rules file. Create nservicebus. in the root folder of the Prometheus installation and add the three rules as defined above.
To enable the rules, edit prometheus. and add:
rule_files:
  - 'nservicebus.rules.txt'
Show a graph
Start Prometheus and open http:/ in a web browser.
NServiceBus pushes events for success, failure, and fetched. These events need to be converted to rates by a query:
avg(rate(nservicebus_success_total[5m]))
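Conceptually, rate() turns an ever-increasing counter into a per-second value over the selected window. The sketch below is a deliberately simplified approximation of that idea, dividing the counter increase by the elapsed time between the first and last sample. It is not how Prometheus implements rate(), which also handles counter resets and extrapolates to the window boundaries. RateSketch and PerSecond are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; a simplification of what rate() computes.
public static class RateSketch
{
    // Samples are (timestamp in seconds, counter value) pairs ordered by time.
    // Returns null when fewer than two samples are available in the window.
    public static double? PerSecond(IReadOnlyList<(double TimestampSeconds, double Value)> samples)
    {
        if (samples.Count < 2)
        {
            return null;
        }

        var first = samples[0];
        var last = samples[samples.Count - 1];
        var elapsed = last.TimestampSeconds - first.TimestampSeconds;
        if (elapsed <= 0)
        {
            throw new ArgumentException("Samples must be ordered by increasing timestamp.", nameof(samples));
        }

        // Counter increase divided by elapsed time gives events per second.
        return (last.Value - first.Value) / elapsed;
    }
}
```

For example, a counter that grows from 100 to 400 over 300 seconds corresponds to roughly one successfully processed message per second. The avg() in the query then averages these per-second values across all matching time series.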
Example configuration
Prometheus configuration files demonstrating the concepts from this sample:
Grafana
Grafana needs to be installed and configured to display the data available in Prometheus. For more information on how to install Grafana, refer to the Installation Guide.
Guided configuration
Execute setup. in a PowerShell session with elevated permissions and provide the username and password to authenticate with Grafana. This script will:
- Create a data source called PrometheusNServiceBusDemo
- Import the sample dashboard and connect it to the data source
Manual configuration
Datasource
Create a new data source called PrometheusNServiceBusDemo. For more information on how to define a Prometheus data source, refer to Using Prometheus in Grafana.
Dashboard
To graph the Prometheus rule nservicebus_failure_total:avg_rate5m, perform the following steps:
- Add a new dashboard
- Add a graph
- Click its title to edit
- Click the Metric tab
Dashboard
The sample includes an export of the Grafana dashboard, which can be imported as a reference.