Skip to content

Error with Pub/Sub starting with StackExchange.Redis 3.0.0 and Garnet #3111

Description

@Dave-Senn

I'm not sure if this is a Garnet or StackExchange.Redis issue, but the problem started occurring after an update of StackExchange.Redis from 2.13.17.0 to 3.0.0.0.

I have an application which runs a Garnet server using the NuGet package:

public sealed partial class GarnetWorker(
    ILogger<GarnetWorker> logger,
    IOptionsMonitor<CacheOptions> cacheOptions,
    ILoggerFactory loggerFactory ) : BackgroundService
{
    protected override async Task ExecuteAsync( CancellationToken stoppingToken )
    {
        await Task.Yield();

        try
        {
            var garnetOptions = new GarnetServerOptions( logger )
            {
                MemorySize = cacheOptions.CurrentValue.MemorySize,
                EndPoints =
                [
                    new IPEndPoint( IPAddress.Parse( cacheOptions.CurrentValue.Address ), cacheOptions.CurrentValue.Port )
                ]
            };

            using var server = cacheOptions.CurrentValue.PreferCmdArgs
                ? new( Environment.GetCommandLineArgs(), loggerFactory )
                : new GarnetServer( garnetOptions, loggerFactory );

            server.Start();

            await Task.Delay( Timeout.Infinite, stoppingToken );
        }
        catch ( OperationCanceledException )
        {
            logger.LogInformation( "Service stopped." );
        }
        catch ( Exception ex )
        {
            logger.LogInformation( ex, "Unexpected error in cache service." );
        }
    }
}

public sealed class CacheOptions
{
    public Int32 Port { get; init; } = 48100;
    public String Address { get; init; } = "0.0.0.0";
    public String MemorySize { get; init; } = "8g";
    public Boolean PreferCmdArgs { get; init; }
}

In a unit tests I create an instance of the worker, start the garnet server and interact with it using StackExchange.Redis:

 [Fact]
 public async Task IntegrationTest()
 {
     var port = TestHelper.GetAvailablePort( 48100 );
     var settings = new CacheOptions
     {
         Address = "0.0.0.0",
         Port = port,
         MemorySize = "128mb"
     };

     var options = TestHelper.GetOptionsMonitor( settings );

     var cts = new CancellationTokenSource();

     var service = new GarnetWorker( TestHelper.GetLogger<GarnetWorker>(),
     options,
     TestHelper.GetILoggerFactory(),
     new Mock<ICommandArgsProvider>().Object );

     var testTask = service.StartAsync( cts.Token );

     var redis = await ConnectionMultiplexer.ConnectAsync( $"localhost:{port},abortConnect=false,connectTimeout=60000,responseTimeout=60000,syncTimeout=60000,asyncTimeout=60000" );
     var db = redis.GetDatabase();

     var key = RandomValueEx.GetRandomString();
     var expectedValue = RandomValueEx.GetRandomString();
     db.StringSet( key, expectedValue );
     var actualValue = db.StringGet( key );
     Assert.Equal( expectedValue, actualValue );

     var channelName = RandomValueEx.GetRandomString();
     var sub = redis.GetSubscriber();
     var subWait = new ManualResetEventSlim( false );
     await sub.SubscribeAsync( RedisChannel.Literal( channelName ),
     ( _, message ) =>
     {
         Assert.Equal( expectedValue, (String)message! );
         subWait.Set();
     } );

     await sub.PublishAsync( RedisChannel.Literal( channelName ), expectedValue );

     if ( !subWait.Wait( 2.ToMinutes(), TestContext.Current.CancellationToken ) )
         Assert.Fail( "Timeout :(" );

     await cts.CancelAsync();
     await testTask;
 }

After the update to 3.0.0 this tests started to fail. On the Garnet side i see the following error:

012  10:22:44.3217  !Error  An error occurred at RecvEventArg_Completed  [GarnetServer]
System.Threading.SynchronizationLockException: The calling thread does not hold the lock.
   at System.Threading.SpinLock.ExitSlowPath(Boolean useMemoryBarrier)
   at Garnet.common.GarnetTcpNetworkSender.ExitAndReturnResponseObject() in /_/libs/common/Networking/GarnetTcpNetworkSender.cs:line 150
   at Garnet.server.RespServerSession.TryConsumeMessages(Byte* reqBuffer, Int32 bytesReceived) in /_/libs/server/Resp/RespServerSession.cs:line 540
   at Garnet.networking.NetworkHandler`2.TryProcessRequest() in /_/libs/common/Networking/NetworkHandler.cs:line 506
   at Garnet.networking.NetworkHandler`2.Process() in /_/libs/common/Networking/NetworkHandler.cs:line 353
   at Garnet.networking.NetworkHandler`2.OnNetworkReceiveWithoutTLS(Int32 bytesTransferred) in /_/libs/common/Networking/NetworkHandler.cs:line 280
   at Garnet.common.TcpNetworkHandlerBase`2.HandleReceiveWithoutTLS(Object sender, SocketAsyncEventArgs e) in /_/libs/common/Networking/TcpNetworkHandlerBase.cs:line 224    at System.Threading.SpinLock.ExitSlowPath(Boolean useMemoryBarrier)
   at Garnet.common.GarnetTcpNetworkSender.ExitAndReturnResponseObject() in /_/libs/common/Networking/GarnetTcpNetworkSender.cs:line 150
   at Garnet.server.RespServerSession.TryConsumeMessages(Byte* reqBuffer, Int32 bytesReceived) in /_/libs/server/Resp/RespServerSession.cs:line 540
   at Garnet.networking.NetworkHandler`2.TryProcessRequest() in /_/libs/common/Networking/NetworkHandler.cs:line 506
   at Garnet.networking.NetworkHandler`2.Process() in /_/libs/common/Networking/NetworkHandler.cs:line 353
   at Garnet.networking.NetworkHandler`2.OnNetworkReceiveWithoutTLS(Int32 bytesTransferred) in /_/libs/common/Networking/NetworkHandler.cs:line 280
   at Garnet.common.TcpNetworkHandlerBase`2.HandleReceiveWithoutTLS(Object sender, SocketAsyncEventArgs e) in /_/libs/common/Networking/TcpNetworkHandlerBase.cs:line 224

And on the Client side get the following error:

StackExchange.Redis.RedisConnectionException
  HResult=0x80131500
  Message=SocketClosed on localhost:48100/Interactive, Idle/RanToCompletion, last: PUBLISH, origin: ReadAllAsync, outstanding: 1, last-read: 0s ago, last-write: 0s ago, unanswered-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, in: 0, in-pipe: 0, last-heartbeat: never, last-mbeat: 2s ago, global: 2s ago, v: 3.0.0.11234
  Source=ConsoleApp1
  StackTrace:
   at Program.<<Main>$>d__0.MoveNext() in C:\Users\dsenn\Downloads\ConsoleApp1\ConsoleApp1\Program.cs:line 32
   at Program.<Main>(String[] args)

Here is the client code as standalone (not unit test):

using StackExchange.Redis;

const Int32 port = 48100;

var redis = await ConnectionMultiplexer.ConnectAsync($"localhost:{port},abortConnect=false,connectTimeout=60000,responseTimeout=60000,syncTimeout=60000,asyncTimeout=60000");
var db = redis.GetDatabase();

var key = Guid.NewGuid().ToString();
var expectedValue = Guid.NewGuid().ToString();
db.StringSet(key, expectedValue);
var actualValue = db.StringGet(key);

var channelName = Guid.NewGuid().ToString();
var sub = redis.GetSubscriber();
var subWait = new ManualResetEventSlim(false);
await sub.SubscribeAsync(RedisChannel.Literal(channelName),
                          (_, message) =>
                          {
                              Console.WriteLine($"{expectedValue} = {message}");
                              subWait.Set();
                          });

await sub.PublishAsync(RedisChannel.Literal(channelName), expectedValue);
subWait.Wait();

While searching for details about the problem I found this issue from you: microsoft/garnet#1615
Based on that issue it looks like a Garnet problem....but since the issue only started occurring after the update to 3.0.0 I'm not sure.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions