Build Your Own Redis: Concurrent Clients [3/4]

This is the third article in a series where we’ll build a toy Redis clone in Ruby. If you’d like to code-along, try the Build your own Redis challenge!

Previous article: Ping <-> Pong

Next article: ECHO

Sections in this article:

Bug: Multiple commands from the same client
Using threads to serve multiple clients
Event Loops: an analogy
Blocking vs. Non-blocking calls
IO#Select
Implementing the event loop

Bug: Multiple commands from the same client

In the previous article, we worked on replying to the PING command.

Here’s what the code for our server looked like:

require "socket"

class RedisServer
  def initialize(port)
    @server = TCPServer.new(port)
  end

  def listen
    loop do
      client = @server.accept
      # TODO: Handle commands other than PING
      client.write("+PONG\r\n")
    end
  end
end

Re-read the code above, and try to answer this: What’ll happen when a client sends their second PING command?

Here’s an integrated test for that question:

  require "redis"
  require "minitest/autorun"
  
  # 6379 for official redis, 6380 for ours
  SERVER_PORT = ENV["SERVER_PORT"]
  
  class TestRedisServer < Minitest::Test
    def test_responds_to_ping
      r = Redis.new(port: SERVER_PORT)
      assert_equal "PONG", r.ping
    end
+ 
+   def test_multiple_commands_from_same_client
+     r = Redis.new(port: SERVER_PORT)
+ 
+     # The Redis client re-connects on timeout by default, without_reconnect
+     # prevents that.
+     r.without_reconnect do
+       assert_equal "PONG", r.ping
+       assert_equal "PONG", r.ping
+     end
+   end
  end

Ready? (If you haven’t taken time to think about the answer, do so now!)

Here’re the test results:

➜  SERVER_PORT=6380 ruby _files/redis/integrated_test_2.rb
Run options: --seed 32401

# Running:

E.

  1) Error:
TestRedisServer#test_multiple_commands_from_same_client:
Redis::TimeoutError: Connection timed out
    /home ... /ruby.rb:71:in `rescue in _read_from_socket'
    ...
    _files/redis/integrated_test_2.rb:18:in `test_multiple_commands_from_same_client'

2 runs, 2 assertions, 0 failures, 1 errors, 0 skips

The client doesn’t receive the second reply!

This happens because after the first @server.accept call, we send PONG and then forget about the client! When the client sends their next PING command, there’s no one listening on the other end.

Using threads to serve multiple clients

Let’s try another approach where we don’t abandon the client after sending PONG. Instead, we’ll keep listening to the client for further commands.

  require "socket"
  
  class RedisServer
    def initialize(port)
      @server = TCPServer.new(port)
    end
  
    def listen
      loop do
        client = @server.accept
+       handle_client(client)
+     end
+   end
+ 
+   def handle_client(client)
+     loop do
+       client.gets
+ 
        # TODO: Handle commands other than PING
        client.write("+PONG\r\n")
      end
    end
  end

This satisfies the use case of a single client that wants to send multiple commands.

There’s still a huge problem here though. When we’re busy servicing one client, we never end up calling @server.accept, so we don’t accept any new clients.

At a given point in time, we’ve got two things we need to be doing:

Accept new clients
Wait on commands from existing clients

Using multiple threads would solve this. Every time we receive a new client, we’ll spawn a new thread that’ll keep on listening to commands from the client until it disconnects.

Let’s quickly write up a test for this:

  require "redis"
  require "minitest/autorun"
  
  # 6379 for official redis, 6380 for ours
  SERVER_PORT = ENV["SERVER_PORT"]
  
  class TestRedisServer < Minitest::Test
    def test_responds_to_ping
      r = Redis.new(port: SERVER_PORT)
      assert_equal "PONG", r.ping
    end
  
    def test_multiple_commands_from_same_client
      r = Redis.new(port: SERVER_PORT)
  
      # The Redis client re-connects on timeout by default, without_reconnect
      # prevents that.
      r.without_reconnect do
        assert_equal "PONG", r.ping
        assert_equal "PONG", r.ping
      end
    end
+ 
+   def test_multiple_clients
+     r1 = Redis.new(port: SERVER_PORT)
+     r2 = Redis.new(port: SERVER_PORT)
+ 
+     assert_equal "PONG", r1.ping
+     assert_equal "PONG", r2.ping
+   end
  end

Followed by the implementation that uses threads:

  require "socket"
  
  class RedisServer
    def initialize(port)
      @server = TCPServer.new(port)
    end
  
    def listen
      loop do
        client = @server.accept
        handle_client(client)
      end
    end
  
    def handle_client(client)
-     loop do
-       client.gets
+     Thread.new do
+       loop do
+         client.gets
  
-       # TODO: Handle commands other than PING
-       client.write("+PONG\r\n")
+         # TODO: Handle commands other than PING
+         client.write("+PONG\r\n")
+       end
      end
    end
  end

This implementation satisfies all our tests:

➜  SERVER_PORT=6380 ruby _files/redis/integrated_test_3.rb
Run options: --seed 21893

# Running:

...

3 runs, 5 assertions, 0 failures, 0 errors, 0 skips

Event Loops: an analogy

Unlike our threaded implementation, the official Redis implementation serves multiple clients using a single thread, not multiple.¹

It achieves this by using an event loop.

Think of your program as a bartender who has to serve multiple clients. When you ask your operating system to start up another thread, you’re essentially asking it to create another bartender. As long as you have as many bartenders as clients, every client will be served.

There’s a certain amount of overhead involved here. Each thread (i.e. bartender) isn’t cheap to spawn, and more threads leads to more time spent switching between threads.

What if, instead of reserving one bartender per client, we tried to serve all clients with just one bartender? This should work, as long as the bartender manages to accept new customers and to serve existing ones in time. That’s what an event loop does.

Your program reacts to multiple events in a loop - each action tends to be so quick that you give away the ‘illusion’ of serving multiple clients at a time. What you really did though, is split up the work into chunks and executed them one-by-one.

Going back to the bartender analogy: Let’s say we decide to go with the single bartender approach. What happens if the bartender ends up chatting with a single customer for a long period of time?

Other customers won’t be serviced, and new customers won’t be accepted either. For this to run well, it is imperative that the bartender doesn’t spend too much time servicing a single customer.

This brings us to the cardinal rule of event loops: never block the event loop.

To run on an event loop, we’ll have to make sure that our program is capable of servicing each event quickly.

Blocking vs. Non-blocking calls

Before we get into implementing an event loop, let’s take a while to understand blocking vs. non-blocking calls.

➜  rohitpaulk.com git:(master) ✗ irb
irb(main):001:0> q = Queue.new
=> #<Thread::Queue:0x000055db24365740>
irb(main):002:0> q.push(1)
=> #<Thread::Queue:0x000055db24365740>
irb(main):003:0> q.pop
=> 1
irb(main):004:0> q.pop

.
..
... Blocked, Indefinitely!

A blocking call is one that only returns until the result is ready. The amount of time spent here can be indefinite, although in most cases it is likely that a timeout is configured.

If a blocking call ends up in our event loop, we’ll end up breaking the ‘never block the event loop’ rule!

Now that we know that blocking calls are deadly, let’s re-visit our server code:

  require "socket"
  
  class RedisServer
    def initialize(port)
      @server = TCPServer.new(port)
    end
  
    def listen
      loop do
+       # DANGER: blocks until a new client is ready to connect!
        client = @server.accept
        handle_client(client)
      end
    end
  
    def handle_client(client)
      Thread.new do
        loop do
+         # DANGER: blocks until a newline character is read!
          client.gets
  
          # TODO: Handle commands other than PING
          client.write("+PONG\r\n")
        end
      end
    end
  end

Here are the calls we’ve used so far that are blocking:

TCPServer#accept
- blocks until a new client is ready to connect
IO#gets
- blocks until a newline character is read

For each of these blocking calls, we’ll need to do one of the following:

Replace it with a non-blocking variant (like a method that returns nil when the result isn’t ready)
Only make the blocking call if we know (through some other mechanism) that the result will be ready (and hence the call won’t have to block)

IO#Select

Ruby’s IO#select (backed by the select syscall) can help in reducing these two calls into non-blocking versions.

IO#select takes in a list of file descriptors, and blocks until one or more of them is ready to be read from. It then returns the subset of file descriptors that are ready to be read from. Thanks to Unix’s “everything is a file” architecture, we can use this mechanism to wait on both a new client wanting to connect and an established socket that has new data to be read from.

# This call blocks until one or more of `fds_to_watch` is ready to read from
ready_to_read, _, _ = IO.select(fds_to_watch, _, _)

# Once the call returns, `ready_to_read` will contain a subset of `fds_to_watch`

Let’s see this in context of our TCP server:

serv = TCPServer.new(6380)
client1 = serv.accept
client2 = serv.accept

fds_to_watch = [serv, client1, client2]

# This blocks till either:
#
# - `serv` has another client ready to connect
# - `client1` has data ready to be read
# - `client2` has data ready to be read
ready_to_read, _, _ = IO.select(fds_to_watch, _, _)

ready_to_read.each do { |ready|
  if ready == serv
    # if the server is ready to read from,
    # that means that a new client is ready
    # to connect.
  else
    # If not the server, this must be either
    # `client1` or `client2`
  end
}

Implementing the event loop

First, let’s look at the events we want to listen on:

A new client is ready to connect
An existing client has new information to send

Whenever one of these events happen, we want to perform an action, which should be non-blocking.

  require "socket"
  
  class RedisServer
    def initialize(port)
      @server = TCPServer.new(port)
+     @clients = []
    end
  
    def listen
      loop do
-       client = @server.accept
-       handle_client(client)
+       fds_to_watch = [@server, *@clients]
+       ready_to_read, _, _ = IO.select(fds_to_watch)
+       ready_to_read.each do |ready|
+         if ready == @server
+           @clients << @server.accept
+         else
+           # If not the server, this must be one of the existing clients
+           handle_client(client)
+         end
+       end
      end
    end
  
    def handle_client(client)
-     Thread.new do
-       loop do
-         client.gets
+     client.readpartial(1024) # TODO: Read actual command
  
-         # TODO: Handle commands other than PING
-         client.write("+PONG\r\n")
-       end
-     end
+     # TODO: Handle commands other than PING
+     client.write("+PONG\r\n")
    end
  end

We’ve kept server.accept as-is, but we only call it when we know it is ready to read - so that shouldn’t result in a non-blocking call.

We’ve replaced IO.gets with IO.readpartial, which is a non-blocking variant. We’re still not parsing input though - we’ll re-visit this in the next article, when we actually decode Redis commands.

Running our tests to make sure they pass against this implementation…

➜  rohitpaulk.com git:(master) ✗ SERVER_PORT=6380 ruby _files/redis/integrated_test_3.rb
Run options: --seed 21893

# Running:

...

Finished in 0.006090s, 492.6094 runs/s, 821.0156 assertions/s.

3 runs, 5 assertions, 0 failures, 0 errors, 0 skips

And they do!

In the next article, we’ll delve into parsing RESP and implement the ECHO command.

Not entirely accurate, Redis does use threads for some background tasks ↩