Anatomy of a Ruby Web Application

January 18, 2010 — Code

So, you’ve heard about some fancy (and supposedly fast) deployment strategy for Rails applications and you want to try it out, but it sounds complicated. Application servers? Rack? Reverse proxies? What are these things and how do they all fit together?

Let’s start exploring at the bottom of the stack, with our Rails (or other) application code, and work our way up. I’ll give names of various libraries and servers along the way, but this is not a tutorial on setting up Mongrel with Nginx, or any other configuration. This article is about understanding Ruby web application deployment in general, so you are better informed and more able to evaluate new technologies and participate in conversations.

Application Code

At the bottom of the stack is your application code. It may be written for a web framework like Rails, Sinatra, or Camping, or it may be a simple stand-alone script. The key point is that, while your code describes the behavior of a web site, there is usually no obvious way to “run” the code by itself. Sure, if you’re using Rails you can type ruby script/server and see your site in a browser at localhost:3000, but you didn’t write the code that makes this happen. Something else that ships with your framework (WEBrick, Mongrel, etc.) does this: a Ruby web server.

Ruby Web Server

You’re probably used to thinking of Apache, Lighttpd, and Nginx when you think of web servers, but what is a web server in a generic sense? A “server” is a daemon (program running in the background) that communicates with clients, and “web” indicates the language of the world wide web: HTTP. A Ruby web server uses a Ruby application to generate its responses (it “serves the application”).

More succinctly: a Ruby web server is a daemon that listens for HTTP requests, gives them to your Ruby application, and responds with a web page (the output of your app).
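To make that definition concrete without depending on any particular server, here is a minimal sketch that uses only Ruby's standard socket library. The names APP, handle, and run are made up for this illustration, the "application" is just a lambda, and the request parsing is deliberately naive. Real servers handle far more (headers, request bodies, concurrency, errors).

```ruby
require 'socket'

# Hypothetical stand-in for "your Ruby application": it receives the
# request method and path and returns the text of a page.
APP = lambda do |method, path|
  "You sent a #{method} request for #{path}\n"
end

# Read one HTTP request from the client, give it to the app,
# and write the app's output back as an HTTP response.
def handle(client, app)
  request_line = client.gets or return      # e.g. "GET /hello HTTP/1.1"
  method, path, _version = request_line.split

  # Skip the request headers (everything up to the first blank line).
  while (line = client.gets) && line != "\r\n"
  end

  body = app.call(method, path)
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" \
               "#{body}"
end

# The "daemon" part: listen on a port and serve requests until killed.
def run(port, app)
  server = TCPServer.new("127.0.0.1", port)
  loop do
    client = server.accept
    handle(client, app)
    client.close
  end
end
```

To try it, add run(5000, APP) as the last line, run the file with ruby, and visit http://localhost:5000/hello.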

It may be educational to see this in action with a very simple app. If you have Mongrel installed, save this script as app.rb:


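require 'rubygems'
require 'mongrel'

# A handler is an object whose process method receives each request
class BareApp < Mongrel::HttpHandler
  def process(request, response)
    # Send a 200 status; the block receives the headers and the body stream
    response.start(200) do |head, out|
      # Dump the request parameters as the page body
      out << request.params.inspect
    end
  end
end

# Listen on all interfaces, port 5000
h = Mongrel::HttpServer.new("0.0.0.0", "5000")
# Send every path under / to our handler
h.register("/", BareApp.new)
# Start accepting connections and wait on the server thread
h.run.join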

Then type ruby app.rb and visit http://localhost:5000 in your web browser. You should see a dump of the HTTP request parameters sent by your browser (generated by request.params.inspect) with names like REQUEST_METHOD, PATH_INFO, QUERY_STRING, etc. You just ran what is known as a “bare Mongrel handler”—an extremely simple application served by Mongrel.

The exact same thing happens when a complex Rails application is served by Mongrel: your application code is executed within the process method of a Mongrel::HttpHandler instance. This isn’t an oversimplification: the next time you get a development error page in a Mongrel-served Rails application, take a look at the full backtrace; somewhere near the bottom you’ll see something like .../mongrel.rb:64:in `process'.

So, Mongrel is a Ruby web server, or a Ruby app server. It is not the only one: WEBrick, mentioned above, is another, and there are more.

As seen in the above example, if our web application is to be served by Mongrel, it needs to know about Mongrel’s process method. If it is to be served by a different app server, it needs to know how to interact with that server.

Rack

The first line of the Rack interface specification is very readable:

A Rack application is an Ruby object (not a class) that responds to call. It takes exactly one argument, the environment and returns an Array of exactly three values: The status, the headers, and the body.

That’s an awfully good summary of what Rack does. The environment parameter is that hash we displayed in our bare Mongrel handler above, and the call method is like Mongrel’s process.
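Here is a minimal sketch of an object that satisfies that sentence of the specification. HelloApp is a name made up for this example, and the environment hash is built by hand to imitate what a server would pass in, so no server (or even the rack gem) is needed to run it.

```ruby
# A Rack-style application: any object that responds to call.
class HelloApp
  def call(env)
    body = "You requested #{env['REQUEST_METHOD']} #{env['PATH_INFO']}\n"
    # Status, headers, body. The body must respond to each, so an Array
    # of strings works. (Recent Rack versions expect lowercase header names.)
    [200, { "content-type" => "text/plain" }, [body]]
  end
end

# Play the part of the server: build an environment and call the app.
env = {
  "REQUEST_METHOD" => "GET",
  "PATH_INFO"      => "/hello",
  "QUERY_STRING"   => ""
}
status, headers, body = HelloApp.new.call(env)

puts status                          # 200
puts headers.inspect
body.each { |chunk| print chunk }    # You requested GET /hello
```

With the rack gem installed, the same object can be handed to a real server by putting run HelloApp.new in a config.ru file and starting it with rackup.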

I’ve gone out of order here—Rack fits in between your code and the app server. It basically turns your application into the object described in the Rack interface specification and passes input/output from/to the web server so that your application/framework doesn’t need to know how the app server works.
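The "in between" role can be sketched as a small piece of glue: on one side it looks like a server handler with a process method, on the other side it speaks the call(env) convention. Everything here is hypothetical: FakeRequest and FakeResponse are stand-ins shaped loosely like the objects in the bare Mongrel handler above, and RackStyleAdapter is not Rack's actual code. The handlers that ship with Rack and with each server do considerably more.

```ruby
# Hypothetical stand-ins for a server's request and response objects.
FakeRequest = Struct.new(:params)

class FakeResponse
  attr_reader :status, :head, :out

  def initialize
    @head = {}
    @out = "".dup   # mutable string that collects the body
  end

  # Mimics the shape of response.start(status) { |head, out| ... }
  def start(status)
    @status = status
    yield @head, @out
  end
end

# The glue: the server sees a handler, the app sees call(env).
class RackStyleAdapter
  def initialize(app)
    @app = app
  end

  def process(request, response)
    status, headers, body = @app.call(request.params)
    response.start(status) do |head, out|
      headers.each { |name, value| head[name] = value }
      body.each { |chunk| out << chunk }
    end
  end
end

# The application knows nothing about the server's process method.
app = lambda do |env|
  [200, { "content-type" => "text/plain" }, ["Hello from #{env['PATH_INFO']}\n"]]
end

response = FakeResponse.new
RackStyleAdapter.new(app).process(FakeRequest.new({ "PATH_INFO" => "/" }), response)
puts response.status   # 200
puts response.out      # Hello from /
```

Swapping app servers then means swapping the adapter, not rewriting the application.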

Rack is a piece of software, but it’s small, and its main purpose is to enforce an interface specification. (Rack is similar to Python’s WSGI.)

Another Web Server?

At this point you may be wondering why people are talking about using Apache or Nginx in their stack. Don’t we already have a web server? We do, but the Ruby app server is not a full-featured web server, and it’s not very good at sending images and other kinds of static files (in fact all app servers are terrible at this, which is why it’s better to call them “app servers” than “web servers,” though the two terms are often used interchangeably). So what we want to do is use a general purpose web server for static files, and pass other requests along to our app server.

This is one of the most complicated and least standardized parts of Ruby web app configuration. Serving static files and handing off requests doesn’t take much processor power, so for a small web site our web server shouldn’t be too busy. The app server, on the other hand, is running our complicated Ruby application, and could be quite busy. For this reason one usually deploys multiple app servers (listening on different ports) for the web server to choose from when a request comes in. The process of handing requests off to these servers is called reverse proxying.
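The decision the front web server makes for each request can be sketched in a few lines. This is only the routing decision, not a working proxy, and every detail is an assumption for illustration: the TinyProxy name, the static path prefixes, and the ports. It hands app requests to each app server in turn (round robin), which is the simplest possible policy.

```ruby
class TinyProxy
  # Hypothetical: paths under these prefixes are plain files on disk.
  STATIC_PREFIXES = ["/images/", "/stylesheets/", "/javascripts/"]

  def initialize(backend_ports)
    # Array#cycle without a block gives an endless enumerator,
    # so calling next walks the ports round-robin.
    @backends = backend_ports.cycle
  end

  # Decide what to do with a request path.
  def route(path)
    if STATIC_PREFIXES.any? { |prefix| path.start_with?(prefix) }
      [:static, path]                                        # serve the file directly
    else
      [:proxy, "http://127.0.0.1:#{@backends.next}#{path}"]  # hand off to an app server
    end
  end
end

proxy = TinyProxy.new([5000, 5001, 5002])
%w[/ /images/logo.png /posts /posts/1 /about].each do |path|
  p proxy.route(path)
end
```

Static requests never touch an app server, and the other four requests are spread over ports 5000, 5001, 5002, and back to 5000.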

(If you’ve worked for a large company you’re probably familiar with accessing the Internet through a forward proxy server. A forward proxy handles outbound, client-side requests; a reverse proxy handles inbound, server-side requests.)

One of the most difficult parts of a reverse proxy server’s job is load balancing: handing off a given request to the most available app server. How this works in various setups is beyond the scope of this article, but you should know that there are many solutions to this problem, and none of the good ones are as trivial as you might think at first.
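To give a feel for what "most available" might mean, here is a toy sketch of one common idea: send each request to the app server with the fewest requests currently in flight. LeastBusyBalancer and its method names are made up for this example, it is not thread-safe, and it ignores everything that makes the real problem hard (slow requests, crashed servers, queueing).

```ruby
class LeastBusyBalancer
  def initialize(ports)
    # Number of requests currently being handled by each app server.
    @in_flight = ports.to_h { |port| [port, 0] }
  end

  # Pick the server with the fewest requests in flight
  # (ties go to the server listed first) and mark it busier.
  def checkout
    port, _count = @in_flight.min_by { |_port, count| count }
    @in_flight[port] += 1
    port
  end

  # Call when a server has finished a request.
  def checkin(port)
    @in_flight[port] -= 1
  end
end

balancer = LeastBusyBalancer.new([5000, 5001])
first  = balancer.checkout   # 5000
second = balancer.checkout   # 5001
balancer.checkin(second)     # 5001 finishes quickly
third  = balancer.checkout   # 5001 again, since 5000 is still busy
puts [first, second, third].inspect
```

Unlike plain round robin, a server stuck on a slow request stops receiving new ones until it catches up.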

Learning More

To learn more you might want to read about GitHub’s setup. It’s fairly complicated due to their need to work extensively with the filesystem, but their choices of technology are thoroughly researched and very smart, and the article is fairly readable.

Thanks to Phusion Passenger (easy-to-install app server, reverse proxy, and load balancer in one), it is no longer necessary to fully understand the intricacies of Ruby web application deployment. However, there is a lot of interesting technology out there, and I believe that some investigation will enhance your appreciation for web programming, and make you a more complete developer. Plus, you never know when you’ll outgrow Passenger!

