RavenDB Persistence Saga Concurrency

Component: RavenDB Persistence
NuGet Package: NServiceBus.RavenDB (6.4)
Target NServiceBus Version: 7.x
RavenDB's implementation of distributed transactions contains a bug that could cause an endpoint, in certain (rare) conditions, to lose data. If RavenDB is configured to enlist in distributed transactions, read DTC not supported for RavenDB Persistence.

Default behavior

When simultaneously handling messages, conflicts may occur. See below for examples of the exceptions which are thrown. Saga concurrency explains how these conflicts are handled, and contains guidance for high-load scenarios.

Creating saga data

Example exception:

Raven.Client.Exceptions.ConcurrencyException: Document OrderSagaData/OrderId/316414b3-07f1-40ec-00db-022a4140d517 has change vector A:2-u2LvKAFZTE+972x2hp1gTg, but Put was called with expecting new document. Optimistic concurrency violation, transaction will be aborted.

Updating or deleting saga data

By default, RavenDB persistence uses optimistic concurrency control when updating or deleting saga data. Starting with NServiceBus.RavenDB version 6.4, the persister can instead be configured to use pessimistic locking, as described in the "Sagas pessimistic locking" section of this document.

When a message handler does not change saga data, the RavenDB client will not attempt to write the associated document to storage. If a consistency check is required, a property value must be changed. For example, a counter property may be incremented.
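The counter approach can be sketched as follows. This is a minimal illustration, not code from the persister: the message types, the `OrderSagaData` shape, and the `ConcurrencyCounter` property are hypothetical names chosen for the example.

```csharp
using System.Threading.Tasks;
using NServiceBus;

// Hypothetical message types, used only for illustration.
public class StartOrder : ICommand
{
    public string OrderId { get; set; }
}

public class CheckOrderStatus : ICommand
{
    public string OrderId { get; set; }
}

public class OrderSagaData : ContainSagaData
{
    public string OrderId { get; set; }

    // Hypothetical property with no business meaning; it exists only to force a write.
    public int ConcurrencyCounter { get; set; }
}

public class OrderSaga : Saga<OrderSagaData>,
    IAmStartedByMessages<StartOrder>,
    IHandleMessages<CheckOrderStatus>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<OrderSagaData> mapper)
    {
        mapper.ConfigureMapping<StartOrder>(message => message.OrderId).ToSaga(saga => saga.OrderId);
        mapper.ConfigureMapping<CheckOrderStatus>(message => message.OrderId).ToSaga(saga => saga.OrderId);
    }

    public Task Handle(StartOrder message, IMessageHandlerContext context)
    {
        Data.OrderId = message.OrderId;
        return Task.CompletedTask;
    }

    public Task Handle(CheckOrderStatus message, IMessageHandlerContext context)
    {
        // This handler only reads saga state. Incrementing the counter changes
        // the document, so the RavenDB client writes it and the optimistic
        // concurrency check is applied.
        Data.ConcurrencyCounter++;
        return Task.CompletedTask;
    }
}
```

Without the increment in the second handler, the saga document would be left unchanged and a concurrent update by another message would go unnoticed.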

Example exception:

Raven.Client.Exceptions.ConcurrencyException: Document OrderSagaDatas/f23921c9-7b53-455d-89be-aad200d98741 has change vector A:93-u2LvKAFZTE+972x2hp1gTg, but Put was called with change vector A:90-u2LvKAFZTE+972x2hp1gTg. Optimistic concurrency violation, transaction will be aborted.

Because the conflict is only detected when the saga data is written, the relevant Handle method on the saga is invoked even though the message may later be rolled back. Saga handlers should therefore not perform any work that cannot roll back together with the message. It also means that under high concurrency there will be N-1 rollbacks, where N is the number of concurrent messages. This can cause throughput issues and might require design changes.

Sagas pessimistic locking

Starting with NServiceBus.RavenDB version 6.4, it's possible to configure saga persistence to use pessimistic locking instead of the default optimistic concurrency control.

RavenDB does not provide pessimistic locking natively. The behavior is based on a spin lock that tries to acquire a lease on a resource.

Applying a spin lock over a remote resource is not as expensive as it may sound. With optimistic concurrency control, the recovery mechanism causes all message processing to be performed again for each retry, including the retrieval of the message from the queue.
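The general idea of a lease-based spin lock can be sketched in memory as shown below. This is not the persister's implementation: `LeasedResource` and `LeaseSpinLock` are hypothetical types, and the lease is held in a local field rather than in a RavenDB document. The sketch only illustrates how a lease time, an acquisition timeout, and a randomized delay between attempts interact.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a document that carries a lease.
public sealed class LeasedResource
{
    private readonly object gate = new object();
    private DateTime leaseExpiresUtc = DateTime.MinValue;

    // Atomically takes the lease if it is free or has expired.
    public bool TryAcquire(TimeSpan leaseTime)
    {
        lock (gate)
        {
            var now = DateTime.UtcNow;
            if (now < leaseExpiresUtc)
            {
                return false; // another holder still owns the lease
            }

            leaseExpiresUtc = now + leaseTime;
            return true;
        }
    }

    // Simplification: any caller can release; a real lease would track its owner.
    public void Release()
    {
        lock (gate)
        {
            leaseExpiresUtc = DateTime.MinValue;
        }
    }
}

public static class LeaseSpinLock
{
    // Repeatedly tries to take the lease until it succeeds or the timeout elapses.
    public static async Task<bool> AcquireAsync(
        LeasedResource resource,
        TimeSpan leaseTime,
        TimeSpan acquisitionTimeout,
        TimeSpan maximumRefreshDelay)
    {
        var random = new Random();
        var deadline = DateTime.UtcNow + acquisitionTimeout;

        while (DateTime.UtcNow < deadline)
        {
            if (resource.TryAcquire(leaseTime))
            {
                return true;
            }

            // Wait a random time (0 to the maximum, inclusive) before the next attempt.
            var delayMilliseconds = random.Next(0, (int)maximumRefreshDelay.TotalMilliseconds + 1);
            await Task.Delay(delayMilliseconds);
        }

        return false; // the caller would treat this as a failed lock acquisition
    }
}
```

In this sketch only the acquisition attempt is repeated while waiting, whereas an optimistic concurrency conflict repeats the whole message processing.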

Choose pessimistic locking over optimistic locking if the system is frequently experiencing optimistic concurrency control errors when saga instances are updated. Optimistic concurrency control remains the most efficient form of processing if the system only occasionally experiences an optimistic concurrency control error.

var sagasConfig = endpointConfiguration.UsePersistence<RavenDBPersistence>()
    .Sagas();
sagasConfig.UsePessimisticLocking();

The pessimistic locking behavior can be customized using the following options:

Pessimistic Lease Lock Time

By default, the persister locks a saga data document for 60 seconds. It is not recommended to have long-running handlers in sagas but it might sometimes be required to increase the lease duration.

The lease duration can be adjusted using the following API:

var sagasConfig = endpointConfiguration.UsePersistence<RavenDBPersistence>()
    .Sagas();
sagasConfig.SetPessimisticLeaseLockTime(TimeSpan.FromMinutes(2));

Pessimistic Lease Lock Acquisition Timeout

By default, the persister waits up to 60 seconds to obtain a lease lock. If the lock acquisition fails, the message goes through the endpoint's configured retry logic.

The behavior of obtaining a lease lock is based on competing on the document for update. This can result in a large increase in IO roundtrips, especially if many instances are competing for this resource.

The pessimistic lease lock acquisition timeout duration can be adjusted with the following API:

var sagasConfig = endpointConfiguration.UsePersistence<RavenDBPersistence>()
    .Sagas();
sagasConfig.SetPessimisticLeaseLockAcquisitionTimeout(TimeSpan.FromSeconds(15));
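Since a failed lock acquisition hands the message to the endpoint's retry logic, the two settings are usually considered together. The following sketch shows how retries might be configured with the NServiceBus recoverability API; the `RetryConfiguration` class name and the retry counts and delay are illustrative assumptions, not recommendations from this document.

```csharp
using System;
using NServiceBus;

// Hypothetical helper class; only the Recoverability calls are NServiceBus API.
public static class RetryConfiguration
{
    public static void Apply(EndpointConfiguration endpointConfiguration)
    {
        var recoverability = endpointConfiguration.Recoverability();

        // Illustrative value: retry immediately a few times before backing off.
        recoverability.Immediate(immediate => immediate.NumberOfRetries(3));

        // Illustrative values: two delayed retries, each waiting 10 seconds longer.
        recoverability.Delayed(delayed =>
        {
            delayed.NumberOfRetries(2);
            delayed.TimeIncrease(TimeSpan.FromSeconds(10));
        });
    }
}
```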

Pessimistic Lease Lock Acquisition Maximum Refresh Delay

To prevent jittering, the saga persister waits a random number of milliseconds between lease lock acquisition attempts. By default, the random waiting time is between zero and 20 milliseconds. The upper bound can be configured: the supplied value must be greater than zero and less than or equal to 1 second.

The pessimistic lease lock acquisition maximum refresh delay can be adjusted via the following API:

var sagasConfig = endpointConfiguration.UsePersistence<RavenDBPersistence>()
    .Sagas();
sagasConfig.SetPessimisticLeaseLockAcquisitionMaximumRefreshDelay(TimeSpan.FromMilliseconds(500));
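For reference, the three options can be combined in one place. The sketch below uses only the configuration methods named in this document; the `SagaLockingConfiguration` wrapper class is a hypothetical name, the values are examples, and the rest of the RavenDB persistence setup (such as the document store) is omitted.

```csharp
using System;
using NServiceBus;

// Hypothetical wrapper class used to keep the example self-contained.
public static class SagaLockingConfiguration
{
    public static void Apply(EndpointConfiguration endpointConfiguration)
    {
        var sagasConfig = endpointConfiguration.UsePersistence<RavenDBPersistence>()
            .Sagas();

        // Switch from the default optimistic concurrency control to pessimistic locking.
        sagasConfig.UsePessimisticLocking();

        // How long a saga data document stays locked (default: 60 seconds).
        sagasConfig.SetPessimisticLeaseLockTime(TimeSpan.FromMinutes(2));

        // How long to keep trying to obtain the lease lock (default: 60 seconds).
        sagasConfig.SetPessimisticLeaseLockAcquisitionTimeout(TimeSpan.FromSeconds(15));

        // Upper bound of the random wait between attempts (default: 20 milliseconds;
        // must be greater than zero and at most 1 second).
        sagasConfig.SetPessimisticLeaseLockAcquisitionMaximumRefreshDelay(TimeSpan.FromMilliseconds(500));
    }
}
```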

Related Articles

  • Saga concurrency
    NServiceBus ensures consistency between saga state and messaging.
