Concurrency in Ruby: Thread and Fiber

Fibers and Threads
Example: HTTP request
Example: HTTP server
Fiber Scheduler
Concluding

This article is based on my latest tech-sharing session with my team at https://pixta.vn/.

Fibers and Threads

Thread

thread = Thread.new do
  #...
end
thread.join

Fiber

fiber = Fiber.new do
  #...
end
fiber.resume # transfer / Fiber.schedule

As you can see, they have quite similar syntax, so what are the differences between them?

Level:
- Threads are mapped 1:1 to OS threads.
- Fibers are implemented at the programming language level; multiple fibers can run inside a single thread.
Scheduling mechanism:
- Threads are scheduled preemptively by almost all modern operating systems.
- Fibers are a mechanism for cooperative concurrency: a fiber keeps running until it explicitly gives up control.
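To make the "level" difference concrete, here is a small sketch (the variable names are mine, just for illustration) that compares the thread a fiber runs on with the thread a new Thread runs on:

```ruby
main_thread = Thread.current

# A fiber runs on whichever thread resumes it, here the main thread.
thread_seen_by_fiber = Fiber.new { Thread.current }.resume

# A new Thread is a separate thread of execution.
thread_seen_by_thread = Thread.new { Thread.current }.value

puts thread_seen_by_fiber.equal?(main_thread)  # same thread object
puts thread_seen_by_thread.equal?(main_thread) # a different thread object
```

The fiber sees the main thread as its current thread, while the block passed to Thread.new sees a different one.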

Threads run automatically; they are scheduled by the OS.
With Thread, programmers can only create new threads, give them some tasks, and call join to wait for them to finish (value does the same and also returns the block's result). The OS decides when each thread runs and when it is paused to achieve concurrency.

[
  Thread.new { puts "task 1" }, # replace with real work
  Thread.new { puts "task 2" }
].each(&:join)
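As a small illustration of join versus value (the squaring task is only a placeholder for real work), this sketch starts three threads and collects what each block returned:

```ruby
threads = 3.times.map do |i|
  Thread.new { i * i } # the block's last expression becomes the thread's value
end

# Thread#value waits for the thread to finish (like join) and returns its result.
squares = threads.map(&:value)
p squares
```

Collecting results through value avoids sharing a mutable array between threads.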

Meanwhile, Fiber gives us more control.
With Fiber, programmers are free to start, pause, and resume them.

Fiber.new { }: creates a new fiber; it starts running on the first resume
Fiber.yield: pauses the current fiber and returns control to where it was resumed
After suspension, a fiber can be resumed later at the same point with the same execution state.

fib2 = nil

fib = Fiber.new do
  puts "1 - fib started"
  fib2.transfer
  Fiber.yield
  puts "3 - fib resumed"
end

fib2 = Fiber.new do
  puts "2 - control moved to fib2"
  fib.transfer
end

fib.resume
puts ""
fib.resume

1 - fib started
2 - control moved to fib2

3 - fib resumed
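The "same execution state" part is easier to see with a fiber that keeps a local variable between resumes. This is a minimal sketch of a counter; the names are only for illustration:

```ruby
counter = Fiber.new do
  count = 0
  loop do
    Fiber.yield count # pause here and hand `count` back to the caller
    count += 1        # execution continues from this line on the next resume
  end
end

first  = counter.resume
second = counter.resume
third  = counter.resume
p [first, second, third]
```

Each resume continues right after the previous Fiber.yield, and the local variable count keeps its value in between.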

Fiber over Thread

A fiber is lighter-weight than a thread, so we can spawn more fibers than threads.
Fibers spend less time on context switching (an advantage of cooperative scheduling compared to preemptive scheduling).
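As a rough sketch of how cheap fibers are to create (the number 2_000 is an arbitrary choice of mine, not a limit), the following creates a couple of thousand fibers, pauses each one once, and then finishes them all:

```ruby
fibers = Array.new(2_000) do |i|
  Fiber.new do
    Fiber.yield i # pause after the first step
    i * 2         # value returned by the final resume
  end
end

first_pass  = fibers.map(&:resume) # every fiber runs until its Fiber.yield
second_pass = fibers.map(&:resume) # every fiber runs to completion

puts first_pass.last
puts second_pass.last
```

All of this runs on one thread, and every switch happens at a point the code chose itself.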

Fiber scheduler

Fibers were introduced in Ruby 1.9, but before Ruby 3 they lacked a scheduler, which limited how useful they were. The Fiber Scheduler has been officially supported since Ruby 3.
The Fiber Scheduler consists of two parts:

Fiber Scheduler interface (what Ruby 3 itself provides)
Fiber Scheduler implementation (provided separately, usually by a gem)

If you want to enable the asynchronous behavior in Ruby, you need to set a Fiber Scheduler object for the current thread.

Fiber.set_scheduler(scheduler)
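To get a feeling for the split between interface and implementation, here is a deliberately tiny toy scheduler. It is only a sketch under my own assumptions: ToySleepScheduler is a hypothetical class that handles nothing but sleep, and a real implementation also has to handle IO, mutexes, timeouts, and more.

```ruby
# A toy scheduler that only knows how to make `sleep` non-blocking.
class ToySleepScheduler
  def initialize
    @waiting = {} # fiber => monotonic time at which it should wake up
  end

  # Called by Fiber.schedule: create a non-blocking fiber and start it.
  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end

  # Called instead of a real sleep inside a non-blocking fiber.
  def kernel_sleep(duration = nil)
    @waiting[Fiber.current] = now + duration.to_f
    Fiber.yield # give control back to whoever resumed this fiber
  end

  # Hooks Ruby checks for but this toy does not support.
  def block(blocker, timeout = nil)
    raise NotImplementedError
  end

  def unblock(blocker, fiber)
    raise NotImplementedError
  end

  def io_wait(io, events, timeout)
    raise NotImplementedError
  end

  # Called when the scheduler is removed: run until no fiber is waiting.
  def close
    until @waiting.empty?
      fiber, wake_at = @waiting.min_by { |_, time| time }
      delay = wake_at - now
      sleep(delay) if delay > 0 # a real sleep, because we are in the main fiber
      @waiting.delete(fiber)
      fiber.resume
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

order = []
Fiber.set_scheduler(ToySleepScheduler.new)
Fiber.schedule { sleep 0.2; order << :slow }
Fiber.schedule { sleep 0.1; order << :fast }
Fiber.set_scheduler(nil) # runs the toy event loop until both fibers are done
p order
```

The two fibers sleep at the same time, so the one with the shorter sleep finishes first even though it was scheduled second. In practice you would use an existing implementation instead of writing your own.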

A list of Fiber Scheduler implementations and their main differences can be found in the Fiber Scheduler List project.

Async gem

One of the most mature and widely used Fiber Scheduler implementations is by Samuel Williams.
He not only implemented a Fiber Scheduler but also created the Async gem, which has a robust API for writing concurrent code.

The next part will help you understand how to use Thread, Fiber, and the Async gem to make concurrent HTTP requests.

HTTP requests example

For example, we will fetch a list of UUIDs from httpbin.org.

require "json"
require "net/http"

def get_uuid
  url = "https://httpbin.org/uuid"
  response = Net::HTTP.get(URI(url))
  JSON.parse(response)["uuid"]
end

Each request takes about 1s to finish.

Sequential version

def get_http_sequentially
  results = []

  10.times do
    results << get_uuid
  end

  results
end

now = Time.now
puts get_http_sequentially
puts "Sequential runtime: #{Time.now - now}" # about 11-12s

One request takes about 1s, so ten sequential calls take at least about 10s.

Concurrent version with threads

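def get_http_via_threads
  threads = 10.times.map do
    Thread.new { get_uuid }
  end

  # Thread#value waits for each thread and returns its block's result,
  # so no array has to be shared between threads.
  threads.map(&:value)
end
# => about 1.3s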

Concurrent version with fibers

require "async"

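def get_http_via_fibers
  # With a scheduler set, blocking IO inside these fibers yields instead of blocking the thread.
  Fiber.set_scheduler(Async::Scheduler.new)
  results = []

  10.times do
    Fiber.schedule do
      results << get_uuid
    end
  end
  results
ensure
  # Removing the scheduler runs it to completion, so results is filled before the caller uses it.
  Fiber.set_scheduler(nil)
end
# => about 1.2s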

Because all requests are called concurrently, the total time is about the time of the slowest request.
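If you want to check this claim without hitting the network, here is a small sketch that uses sleep as a stand-in for the HTTP call. The delays are made-up numbers:

```ruby
delays = [0.1, 0.2, 0.3] # pretend response times, in seconds

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = delays.map { |delay| Thread.new { sleep(delay); delay } }.map(&:value)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "slowest task: #{delays.max}s, sum of all tasks: #{delays.sum.round(1)}s"
puts "elapsed: #{elapsed.round(2)}s"
```

The elapsed time should land close to the slowest delay rather than the sum of all of them.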

More about Async

Another way to use the Async gem is to call the Kernel#Async method instead of setting Async::Scheduler ourselves:

def get_http_via_async
  results = []

  Async do
    10.times do
      Async do
        results << get_uuid
      end
    end
  end
  results
end

The general structure of Async Ruby programs:

You always start with an Async block, which is passed a task.
That main task is usually used to spawn more Async tasks with task.async.
These tasks run concurrently with each other and with the main task.

Each task is built on top of a fiber.

HTTP server example

A minimal HTTP server in Ruby can be implemented with the built-in TCPServer class. It looks like this:

require "socket"

socket = TCPServer.new(HOST, PORT)
socket.listen(SOCKET_READ_BACKLOG)

loop do
  conn = socket.accept # wait for a client to connect
  request = RequestParser.call(conn)
  #... status, headers, body
end
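The snippet above leaves out parsing and responding. As a self-contained sketch of one full request/response cycle, the following runs a one-shot server and a client in the same process. The path /ping, the body "pong", and reading only the request line are my own simplifications; this is not the RequestParser from the original code.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]

server_thread = Thread.new do
  conn = server.accept
  request_line = conn.gets # e.g. "GET /ping HTTP/1.1\r\n"
  # Read and discard the headers up to the blank line that ends them.
  while (line = conn.gets) && line != "\r\n"
  end
  body = "pong"
  conn.write("HTTP/1.1 200 OK\r\n" \
             "Content-Length: #{body.bytesize}\r\n" \
             "Connection: close\r\n\r\n" \
             "#{body}")
  conn.close
  request_line
end

client = TCPSocket.new("127.0.0.1", port)
client.write("GET /ping HTTP/1.1\r\nHost: localhost\r\n\r\n")
response = client.read # read until the server closes the connection
client.close

request_line = server_thread.value
server.close

puts request_line
puts response
```

The server handles exactly one connection here; the loop in the original snippet is what keeps a real server accepting more.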

Now we'll make the server handle more than one request at a time.

Thread pool version

pool = ThreadPool.new(size: 5)
loop do
  conn = socket.accept # wait for a client to connect
  pool.schedule do
    # handle each request
    request = RequestParser.call(conn)
  end
end

The idea is to use a thread pool to limit the number of threads running concurrently.
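ThreadPool is not a Ruby built-in. Assuming it is a small helper class, a minimal sketch could look like the following; SimpleThreadPool and its methods are hypothetical names of mine, built on Ruby's Queue:

```ruby
class SimpleThreadPool
  def initialize(size:)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; nil is the shutdown signal.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Enqueue a block; one of the idle workers will pick it up.
  def schedule(&job)
    @jobs << job
  end

  # Send one stop signal per worker, then wait for all of them to exit.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

pool = SimpleThreadPool.new(size: 2)
results = Queue.new # Queue is thread-safe, so workers can push into it
5.times { |i| pool.schedule { results << i * i } }
pool.shutdown

squares = Array.new(results.size) { results.pop }.sort
p squares
```

At most two jobs run at the same time here, no matter how many are scheduled. A production pool would also need to rescue errors inside job.call so that a failing job does not kill its worker.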

Async version

Async do
  loop do
    conn = socket.accept # wait for a client to connect
    Async do
      # handle each request
      request = RequestParser.call(conn)
    end
  end
end

Falcon is the best-known app server that uses Async to handle connections.

More details about the implementation and benchmark testing are available in this repo.

Concluding

Threads and fibers allow programmers to write concurrent code, which is very useful for handling blocking IO operations.
As Ruby developers, we don't use Thread directly most of the time. But in reality, for web development, a lot of tools use threads:
- A web server like Puma or WEBrick
- A background job library like Sidekiq, GoodJob, or SolidQueue
- An ORM like ActiveRecord or Sequel
- An HTTP client like HTTParty or RestClient
Fiber (together with the Fiber Scheduler) has only been fully usable since Ruby 3, and it may have a bright future thanks to its advantages over Thread. Here are a couple of the most useful tools built on top of fibers:
- async-http: a featureful HTTP client
- falcon: an HTTP server built around the Async core
- ...