ServiceControl instances collect and analyze data about the endpoints that make up a system and the messages flowing between them. This data is exposed to ServiceInsight and ServicePulse via an HTTP API and SignalR, and via external integration events.
Each endpoint in the system should be configured to send audit copies of every message that is processed into a central audit queue. A ServiceControl instance reads the messages in the audit queue and makes them available for visualization in ServiceInsight. ServiceControl can optionally forward these messages into an Audit Log queue for further processing if required.
Each endpoint in the system should be configured to send failed messages to a central error queue after those messages have gone through immediate and delayed retries. A ServiceControl instance reads the messages in the error queue and makes them available to be retried manually in ServicePulse and ServiceInsight. ServiceControl can optionally forward these messages into an Error Log queue for further processing if required.
Each endpoint may have additional plugins installed which collect and send data to a ServiceControl instance. The Heartbeats plugin can be used to detect which endpoint instances are running and which are offline. The Custom Checks plugin enables endpoints to send user-defined health reports to ServiceControl on a regular schedule. The Saga Audit plugin instruments audit messages with details of saga state changes for visualization in ServiceInsight.
Each ServiceControl instance raises external integration events when important situations are detected. These are standard NServiceBus events that can be subscribed to by any NServiceBus endpoint. See Use ServiceControl events for a complete list.
Each ServiceControl instance stores data in an embedded database. Audit data is retained for 30 days. Failed message data is retained until the message is retried or manually archived. These retention periods can be customized.
Each environment should have a single audit queue and a single error queue that all endpoints are configured to use. Each environment should have a single ServiceControl instance that is connected to it's audit and error queues. Consider the advice given in the Planning section of the documentation before creating a new ServiceControl instance.
ServiceControl includes some basic self-monitoring implemented as custom checks. These checks will be reported in ServicePulse alongside any custom checks being reported from endpoints.
MSMQ servers have a single transactional dead letter queue. Messages that cannot be delivered to queues located on remote servers will eventually be moved to the transactional dead letter queue when the MSMQ service is unable to deliver the message. ServiceControl will monitor the transactional dead letter queue on the server it is installed on as the presence of messages in this queue may indicate a problem with delivering message retries.
Azure Service Bus queues each come with an associated dead letter queue. When ServiceControl sends a message for retry it utilizes a staging queue to do so. ServiceControl will monitor the dead letter queue of the ServiceControl staging queue as the presence of messages in this queue indicates a problem with delivering message retries.
When ServiceControl is unable to properly import an audit or error message, the error is logged and the message is stored separately in ServiceControl. ServiceControl will monitor these failed import stores and notify when any are found. Read more about re-importing failed messages here.
When ServiceControl has difficulty connecting to the configured transport, the error message ingestion process is shut down for sixty seconds. These shutdowns are monitored with a custom check. The time to wait before restarting the error ingestion process is configurable with the ServiceControl/TimeToRestartErrorIngestionAfterFailure setting.
ServiceControl stores messages in an embedded database on the local file system. If the disk runs out of space, message ingestion will fail and the ServiceControl instance will stop. This type of failure can cause instability in the database, even after storage space has been increased. ServiceControl will monitor the remaining storage space on the drive containing the embedded message store. The check will report a failure if the hard drive has less than 20% remaining of its total capacity. This threshold can be changed with the ServiceControl/DataSpaceRemainingThreshold setting.