Saga Concurrency

Component: NServiceBus
NuGet Package NServiceBus (7.x - 7.1)

If the endpoint is configured to allow concurrent processing of messages (the default) or is scaled out, multiple messages can hit the same saga instance simultaneously. To give ACID semantics in this situation, NServiceBus relies on the underlying storage to produce consistent behavior, allowing only one of the messages to complete. NServiceBus handles most of this automatically, but there are some caveats.

Concurrent access to saga instances is divided into two scenarios:

  • Concurrently trying to create the same instance of a new saga.
  • Concurrently trying to update the same instance of an existing saga.

Each saga persister honors the following semantics; see the specific persister documentation for implementation details.

Concurrent access to non-existing saga instances

Sagas are started by the message types that are handled with IAmStartedByMessages<T>. If messages mapped to the same saga instance are processed concurrently, there is a risk that duplicates of the instance will be created.

In this case, only one message is allowed to complete processing. The others roll back and the built-in retries in NServiceBus kick in. On the next retry, the saga instance is found, the race condition is resolved, and that saga instance is updated instead (see below).
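The create-then-retry sequence can be illustrated with a small model. This is a sketch in Python, not NServiceBus code: `SagaStore`, `DuplicateSagaError`, and the `"order-1"` correlation id are hypothetical names, and the in-memory dictionary only stands in for whatever uniqueness guarantee a real persister provides.

```python
class DuplicateSagaError(Exception):
    """Raised when a saga with the same correlation id already exists."""


class SagaStore:
    """Hypothetical in-memory store that enforces one saga per correlation id."""

    def __init__(self):
        self._sagas = {}

    def find(self, correlation_id):
        return self._sagas.get(correlation_id)

    def insert(self, correlation_id, data):
        # Stand-in for a unique constraint in the underlying storage.
        if correlation_id in self._sagas:
            raise DuplicateSagaError(correlation_id)
        self._sagas[correlation_id] = data


store = SagaStore()

# Messages A and B both look up the saga before either has committed,
# so neither finds an existing instance.
seen_by_a = store.find("order-1")
seen_by_b = store.find("order-1")

# A commits first and creates the instance.
store.insert("order-1", {"started_by": "A"})

# B also tries to create it; the store rejects the duplicate and B rolls back.
try:
    store.insert("order-1", {"started_by": "B"})
    b_rolled_back = False
except DuplicateSagaError:
    b_rolled_back = True

# On retry, B finds the instance and updates it instead of creating a new one.
saga = store.find("order-1")
saga["updated_by"] = "B"
```

After the retry there is exactly one instance for `"order-1"`, created by A and updated by B.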

Concurrent access to existing saga instances

When messages concurrently try to update the same saga instance, the storage will either detect the conflict and throw a concurrency exception or serialize access to the instance. Concurrency exceptions are resolved automatically by the NServiceBus retries.

Another option is to use a transaction isolation level of Serializable, but that causes excessive locking with considerable performance degradation.

While Serializable is the default isolation level for TransactionScopes, in NServiceBus Version 4 and higher the isolation level will default to ReadCommitted.
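The concurrency-exception path can be modeled as optimistic concurrency with a version check. The following Python sketch is only an illustration under that assumption: `VersionedStore`, `ConcurrencyError`, and the `items_received` field are hypothetical, and actual persisters may detect conflicts differently or lock the instance instead.

```python
class ConcurrencyError(Exception):
    """Raised when the stored version no longer matches the version that was loaded."""


class VersionedStore:
    """Hypothetical store that keeps a version number next to each saga's data."""

    def __init__(self):
        self._rows = {}  # saga_id -> (version, data)

    def create(self, saga_id, data):
        self._rows[saga_id] = (1, dict(data))

    def load(self, saga_id):
        version, data = self._rows[saga_id]
        return version, dict(data)  # return a copy so callers work on their own state

    def save(self, saga_id, expected_version, data):
        current_version, _ = self._rows[saga_id]
        if current_version != expected_version:
            raise ConcurrencyError(saga_id)
        self._rows[saga_id] = (current_version + 1, dict(data))


store = VersionedStore()
store.create("order-1", {"items_received": 0})

# Two messages load the same version of the saga and each apply a change.
version_a, data_a = store.load("order-1")
version_b, data_b = store.load("order-1")
data_a["items_received"] += 1
data_b["items_received"] += 1

# The first save succeeds and bumps the version.
store.save("order-1", version_a, data_a)

# The second save is based on a stale version and is rejected.
try:
    store.save("order-1", version_b, data_b)
    conflict_detected = False
except ConcurrencyError:
    conflict_detected = True

# The retry reloads the current state, reapplies the change, and succeeds.
version_b, data_b = store.load("order-1")
data_b["items_received"] += 1
store.save("order-1", version_b, data_b)

final_version, final_data = store.load("order-1")
```

Neither update is lost: the rejected message is processed again against the latest state, which is the role the retries play in the description above.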

High load scenarios

Under extremely high load, such as batch processing, concurrent access to a saga instance might lead to messages being moved to the error queue because the NServiceBus retries are exhausted.

In that scenario consider re-designing the process.

Take a look at Jimmy Bogard's blog post about reducing saga load.
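To see why retries can run out, consider a deliberately pessimistic model. It assumes every pending message hits the same saga instance in lock-step rounds, exactly one message commits per round, and every other message fails and uses up one attempt. The function `simulate_contention` and its parameters are hypothetical and do not correspond to NServiceBus settings; the sketch is in Python.

```python
def simulate_contention(message_count, max_attempts):
    """Worst-case model: one message commits per round, all others fail and retry.

    Returns the ids of messages that completed and the ids that ran out of attempts.
    """
    attempts = {message: 0 for message in range(message_count)}
    pending = list(range(message_count))
    completed, error_queue = [], []

    while pending:
        winner, losers = pending[0], pending[1:]
        attempts[winner] += 1
        completed.append(winner)

        pending = []
        for message in losers:
            attempts[message] += 1
            if attempts[message] >= max_attempts:
                # No attempts left: the message is moved to the error queue.
                error_queue.append(message)
            else:
                pending.append(message)

    return completed, error_queue


# Five concurrent messages for one saga, three attempts each.
completed, error_queue = simulate_contention(message_count=5, max_attempts=3)

# Three concurrent messages with the same attempt budget.
small_completed, small_error_queue = simulate_contention(message_count=3, max_attempts=3)
```

In this model, five messages with three attempts each leave two messages in the error queue, while three messages all complete. Real contention is less regular than this, but the pattern is the same: once the number of messages competing for one instance exceeds the retry budget, some of them fail permanently.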
