- See the Troubleshooting section of the ServiceControl release notes for guidance on checking whether the ServiceControl HTTP API is accessible.
- Verify that ServicePulse is trying to access the correct ServiceControl URI (based on the ServiceControl instance URI defined in the ServicePulse installation settings).
- Check that ServicePulse is not blocked from accessing the ServiceControl URI by firewall settings.
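The connectivity checks above can be sketched as a small shell probe. This is a minimal sketch, not an official diagnostic: the `/api/` path and the default address are assumptions to adjust for the actual installation, and `probe_status` and `describe_status` are hypothetical helper names.

```shell
#!/bin/sh
# Assumption: ServiceControl listens on the default host and port, with its API under /api/.
SERVICECONTROL_URL="${SERVICECONTROL_URL:-http://localhost:33333/api/}"

# Print the HTTP status code returned by the URL; curl prints 000 when no connection could be made.
probe_status() {
    curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1" || true
}

# Translate a status code into a hint about where to look next.
describe_status() {
    case "$1" in
        000) echo "unreachable: check the URI, that ServiceControl is running, and firewall rules" ;;
        2??) echo "reachable" ;;
        *)   echo "responded with HTTP $1: check the ServiceControl logs" ;;
    esac
}
```

Run it from the machine that hosts ServicePulse with `describe_status "$(probe_status "$SERVICECONTROL_URL")"`, so the probe follows the same network path that ServicePulse uses.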
- Follow the guidance in How to configure endpoints for monitoring by ServicePulse.
- Restart the endpoint after copying the Endpoint Plugin files into the endpoint's Bin directory.
- Ensure the endpoint references NServiceBus Version 4.0.0 or later.
- Ensure auditing is enabled for the endpoint and that messages are forwarded to the audit and error queues monitored by ServiceControl.
- Ensure the relevant ServiceControl assemblies are included in the whitelist or are not excluded in the blacklist. For more details refer to Assembly scanning.
A RavenDB index may be corrupted. To fix this:
- Put ServiceControl in Maintenance Mode.
- Run the following in curl:
curl -X RESET http://localhost:33333/storage/indexes/FailureGroupsViewIndex
This assumes ServiceControl is using the default host name and port. If this is not the case, adjust the URL accordingly.
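When the host name or port differs from the defaults, the URL can be built from parameters instead of being edited by hand. The helper below is a hypothetical convenience wrapper around the command above; only the URL path and the `RESET` method come from this document.

```shell
#!/bin/sh
# Hypothetical helper: build the index reset URL for a given host and port.
# Falls back to the defaults used in this document (localhost, 33333).
index_reset_url() {
    printf 'http://%s:%s/storage/indexes/FailureGroupsViewIndex\n' "${1:-localhost}" "${2:-33333}"
}
```

For example, `curl -X RESET "$(index_reset_url sc-host 33334)"` would target an instance on the placeholder host `sc-host`, port 33334.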
After a period of inactivity, a web application endpoint is failing with the message:
Endpoint has failed to send expected heartbeat to ServiceControl. It is possible that the endpoint could be down or is unresponsive. If this condition persists restart the endpoint.
When accessed, the web application operates as expected, and shortly after it is accessed the heartbeat is restored and the endpoint status is shown as active.
The issue is due to the way IIS handles application pools. By default, the application pool is stopped after a certain period of inactivity or, under certain configurable conditions, recycled. In both cases, the ServicePulse heartbeat is no longer sent until a new web request comes in and wakes up the web application.
There are two ways to avoid the issue:
- Configure IIS to avoid recycling.
- Use a periodic warm-up HTTP GET to make sure the website is not brought down due to inactivity. The interval between requests must be shorter than 20 minutes, which is the default IIS recycle-on-idle time.
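The periodic warm-up request could be sketched as follows. This is one possible shape, not a prescribed implementation: `SITE_URL` is a placeholder, the 600 second interval is an arbitrary value chosen to stay below the 20 minute default, and all function names are hypothetical.

```shell
#!/bin/sh
# Assumptions: SITE_URL is a placeholder for the web application's address;
# 600 seconds is an arbitrary interval below the 20 minute (1200 second) default.
SITE_URL="${SITE_URL:-http://localhost/}"
INTERVAL_SECONDS="${INTERVAL_SECONDS:-600}"
IDLE_TIMEOUT_SECONDS=1200

# Succeed only if the interval is positive and shorter than the idle timeout.
interval_is_safe() {
    [ "$1" -gt 0 ] && [ "$1" -lt "$IDLE_TIMEOUT_SECONDS" ]
}

# Send one warm-up GET; the response body is discarded.
warm_up() {
    curl --silent --fail --output /dev/null --max-time 30 "$1"
}

# Loop forever, sending one request per interval and logging failures to stderr.
keep_warm() {
    interval_is_safe "$INTERVAL_SECONDS" || { echo "interval too long" >&2; return 1; }
    while true; do
        warm_up "$SITE_URL" || echo "warm-up request to $SITE_URL failed" >&2
        sleep "$INTERVAL_SECONDS"
    done
}
```

Calling `keep_warm` starts the loop. A scheduled task that issues the same GET at the same interval would serve equally well.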
In IIS 7.5 and later, the two approaches can be combined by following these steps:
- Enable AlwaysRunning mode for the application pool of the site. Go to the application pool management, open the Advanced Settings, and in the General section switch the
- Enable Preload for the site itself. Right-click the site, choose Manage Site, then Advanced Settings, and in the General settings switch
- Install the Application Initialization Module.
- Add the following to the web.config in the system.webServer node.
<applicationInitialization doAppInitAfterRestart="true">
  <add initializationPage="/" />
</applicationInitialization>
In some cases, configuring IIS to avoid recycling is not possible. The recommended approach is then the second one, the periodic warm-up request, which has the side benefit of avoiding the response-time hit otherwise paid by the first user after an idle period.
This may occur when an endpoint is redeployed or updated to a different installation path (a common procedure with deployment managers such as Octopus).
The installation path of an endpoint is used by ServiceControl and ServicePulse as the default mechanism for generating the unique Id of an endpoint. Changing the installation path of the endpoint affects the generated Id, and causes the system to identify the endpoint as a new and different endpoint.
To work around this issue, see Override host identifier.
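Why a new installation path shows up as a new endpoint can be illustrated with any identifier computed from the path. The sketch below uses `cksum` purely as a stand-in; it is not the algorithm ServiceControl uses, and both the paths and the function name `path_derived_id` are hypothetical.

```shell
#!/bin/sh
# Illustration only: derive an identifier from an installation path.
# cksum stands in for whatever hashing the real system applies.
path_derived_id() {
    printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

# Hypothetical paths of the kind a deployment tool might produce for two releases.
OLD_PATH='C:\Octopus\Applications\Sales\1.0.0'
NEW_PATH='C:\Octopus\Applications\Sales\1.0.1'
```

Comparing `path_derived_id "$OLD_PATH"` with `path_derived_id "$NEW_PATH"` gives two different values, which mirrors how a redeployed endpoint is treated as a different one unless the identifier is fixed explicitly.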
After enabling Heartbeat plugins for Version 3 endpoints, ServicePulse reports that endpoints are inactive
Messages forwarded to the audit queue by endpoints running NServiceBus Version 3.x did not carry the HostId header, which uniquely identifies the endpoint. Adding the heartbeat plugin to Version 3 endpoints automatically enriches the headers with this HostId information using a message mutator. Because the messages originally processed from the audit and error queues lack this identifier, ServiceControl cannot reliably tell that they and the heartbeat messages belong to the same endpoint. As a result, the Endpoints Indicator shows a discrepancy.
To work around this issue and monitor Version 3 endpoints:
- Add the heartbeat plugin to all Version 3 endpoints. The plugin adds the required header with the host information, which ServiceControl can then process.
- Restart ServiceControl to clear the endpoint counter.