Script-Only HTTP Servers?
I’ve been thinking about and looking for a script-only HTTP server. That is, a server that’s optimized for producing dynamic content, invoking a script on every request.
I tend to think of the serving of static and dynamic content as different things. One is basically a file system while the other is a service provider.
In a past life I spent some time as a Web app performance optimization “expert”. One of the basic techniques was to separate the serving of static content from dynamic content. You could do this by setting up one bank of servers to serve your images, style sheets, and HTML pages, and another to serve scripts. There are many good reasons to do this, including:
- Different caching characteristics, allowing the use of caches and distributed caches for the static content
- Different memory usage characteristics. Static content can typically be served with lean and mean servers taking up little memory (and memory is an important gating factor in how many simultaneous clients you can serve), while dynamic content requires fatter servers
There are other practical matters. For example, I have access to several servers with high bandwidth that I can serve static content from, but only one server with limited bandwidth that I can serve dynamic content from. This is an easy situation to get into: you can get very cheap hosting of static content, but dynamic content costs you more, and your choice of how you serve dynamic content (i.e., root access) costs you even more.
Which brings me to the point: I was expecting that by now the major scripting languages would offer their own built-in HTTP servers, if for no reason other than convenience. I see that that’s not the case - the assumption is still that the scripting engine will be plugged into Apache, à la mod_perl, mod_python, etc. This means the servers are all fat and there is no separation of dynamic and static content.
Meanwhile, you have FastCGI, which is basically a way of setting up a separate bank of servers for dynamic content. The request goes from Apache, through FastCGI, to the dynamic execution engines. Ruby on Rails, for example, makes use of this. It’s nice because it gives you the separation, allows you to scale your script engines separately from your static engines, and makes your module pluggable into any HTTP server that supports FastCGI.
But FastCGI is an extra layer I’d like to get rid of. Why not have a native server built into the scripting language, obviating the need for FastCGI? What value is gained by having dynamic requests go through Apache on their way to my serving engines?
Well, there’s the resolution of the URL to the script to be executed. Apache figures out what needs to be invoked and invokes it.
But is that a good thing? Check out mod_rewrite. Check out the forums on CherryPy. People want more control over how the URL is translated to what’s invoked. We are living in a REST world after all, and parsing the URL is very meaningful.
To be fair, Apache offers a lot. It takes care of everything for you so you only have to write your scripts. It’s standard, it’s everywhere, and people know how to operate and tune it.
Still, I’m convinced it’s cleaner to have a separation. Have the request hit my scripting engine from the start, and skip Apache altogether. Give me full control over URL parsing.
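To make “full control over URL parsing” a little more concrete, here is a minimal sketch in Python of a script engine resolving URLs itself instead of leaving that to Apache. The route table, the handler names, and the `resolve` function are all hypothetical, made up for illustration; only the standard `re` and `urllib.parse` modules are used.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical route table: (HTTP method, URL pattern, handler name).
ROUTES = [
    ("GET", re.compile(r"^/posts/?$"), "list_posts"),
    ("GET", re.compile(r"^/posts/(?P<post_id>\d+)$"), "show_post"),
    ("POST", re.compile(r"^/posts/?$"), "create_post"),
]


def resolve(method, url):
    """Return (handler_name, path_params, query_params), or None if no route matches."""
    parts = urlsplit(url)
    for route_method, pattern, handler_name in ROUTES:
        if route_method != method:
            continue
        match = pattern.match(parts.path)
        if match:
            # Named groups in the pattern become the path parameters.
            return handler_name, match.groupdict(), parse_qs(parts.query)
    return None
```

With this table, `resolve("GET", "/posts/42")` picks the `show_post` handler with `post_id` set to `"42"`, while an unmapped method or path yields `None`. The point is only that the mapping lives in the script, where it can be changed without touching server configuration.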
Ruby on Rails, as usual, appears to have the lead in this. They ship with a built-in HTTP server, which means you can get up and running in no time. They’re also offering lighttpd as a choice, meaning you can skip Apache. They continue to use FastCGI, which is probably a fine thing to do.
Apache is still better at managing the processes and connections. There are a lot of security features in there that you’d have to duplicate in your language, and then why?
With mod_perl, you can also have complete control over the processing. I hate mod_rewrite. Instead, I use Perl subroutines to do URL translation during that phase. The important thing to remember is that mod_perl is not just a content delivery system: it is a system for scripting EVERY phase of a request. I know people who have ripped out every single mod_* EXCEPT for mod_perl, and run their server that way… letting nice fast C code manage the connections and processes, and Perl do everything else.
Randal, excellent point. Running Apache with mod_perl only and having it do script serving gets me what I want - a fast, no-overhead script server. I’m not familiar with having Perl do URL translation, but I have no doubt it’d be better than mod_rewrite.
Maybe my issue is just semantics and packaging - the environment you describe really requires an expert to set up. Your average web developer is not going to be doing that. I’d like to have Perl come pre-packaged with what you describe.
In other words, I’d like to assume HTTP addressability as a capability of Perl (Python, etc.), as opposed to assuming Perl as a plug-in to Apache. Perl does a lot more for me than Apache; HTTP should just be another way it can be invoked.
Well, there is a big difference between mod_perl and your other scripting mod_*s like mod_python and mod_fastcgi, and that is that mod_perl gives you access to the Apache APIs. This effectively gives you full control of the Apache webserver from within your mod_perl application. That’s full control over the entire Apache lifecycle: the TCP/IP connection, configuration, how URLs/URIs are translated, how URLs/URIs are mapped to the filesystem, how things are logged, etc.
So while mod_perl *is* put into Apache, it really has all of the same control features you would get with a standalone special Perl server.
There is also a technique you can use to get the best of both worlds on the same server. You can use a non-mod_perl Apache to serve all of your static content and then map specific URIs with mod_proxy and have the dynamic portions served by another mod_perl-enabled Apache running in the background. This gives you practically the same setup as the FastCGI setup you described above, but without having to use anything other than Apache, and you get to have the full “power” of mod_perl available to you.
You might be interested in this article I wrote last year that discusses some of the cool/useful features of mod_perl 2.0 that aren’t as widely known as the normal “speed up your CGI” features.
It also describes what Randal was talking about for doing mod_rewrite in a Perl space. Hope you find it useful.
Frank, I didn’t realize the extent of control mod_perl gives you. What you describe is equivalent to the Perl server I’m looking for. Interesting.
Interesting, but you probably know by now that Python has this covered - BaseHTTPServer (and subclasses SimpleHTTPServer and CGIHTTPServer) is in the standard library. I’ve seen a bunch of Python packages that use it (usually by subclassing, e.g., for FastCGI or WSGI) to provide standalone HTTP servers. In fact, I can’t think of any Python-based web packages that don’t provide a BaseHTTPServer based standalone mode for development and testing.
A frontend server is still useful, though, for various reasons.
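For reference, BaseHTTPServer was renamed `http.server` in Python 3. A minimal sketch of a script-only server built on it might look like the following. The `respond` function, the `/hello` path, and port 8000 are assumptions chosen for illustration, not anything taken from the packages mentioned above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


def respond(path):
    """Hypothetical 'script': turn a URL path into (status, body text)."""
    if path == "/hello":
        return 200, "Hello from a script-only server\n"
    return 404, "No script mapped to %s\n" % path


class ScriptHandler(BaseHTTPRequestHandler):
    """Runs respond() for every GET request; no static files are served."""

    def do_GET(self):
        # Strip the query string so only the path is used for dispatch.
        status, text = respond(urlsplit(self.path).path)
        body = text.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) the server; port 8000 is an arbitrary choice."""
    return HTTPServer((host, port), ScriptHandler)
```

Calling `make_server().serve_forever()` starts it. `HTTPServer` handles one request at a time, which suits the development and testing use described above and is one reason a frontend server remains useful in production.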