Playing Around With Web Servers

Last modified on March 9th, 2009

Obviously I’ve been spending a bit of time tweaking Apache such that it is well optimized for working in a VPS environment. Truthfully though, Apache is a major resource hog. To be honest, I really can’t comprehend what it’s doing with all that memory it seems to hoard. One of the other major downsides to Apache is that it forks additional processes to handle server load, and processes are rather heavy (at least when compared to user threads).

Since I haven’t touched any C++ in about two months, I thought I’d try my hand at writing a quick web server to see if I can better understand some of problems. For now, I’m leveraging boost for all the threading, and just using simple select now for all the socket communication (I understand the limitations of select, and will probably move things to boost::asio when it’s all done).

I originally wrote it using a half-async pattern, but found that I was having problems getting the threading the way I wanted without having lock contention going on between threads. So I’ve pretty much abandoned that in favor of a fully asynchronous architecture, which is trickier to handle but ultimately gets rid of nearly all of the locks. I still haven’t really figured out the best way to handle memory, but obviously some type of pool is going to be required for high efficiency. One of the areas I spent a lot of time with at my last job was optimizing code for speed, so I’m hoping I can bring some of that to the web space problem as well.

In terms of the actual server, right now there’s a main thread handling all the dispatching and communications, and a pool of worker threads handling the actual web traffic. It’s a typical web traffic architecture, and I plan to pick it apart when I’m done to see if it actually makes sense.

In about four hours in front of the TV yesterday I managed to get the web server (affectionately called Webby right now) to serve static content. I’m using four threads right now for the worker pool, but it’s completely configurable (I’m guessing around four threads per core is probably ideal). Once I had that working, I decided to take a stab at getting PHP code to work. I did some basic stress testing at this point (no point in continuing if I can’t blow Apache away). I did 10,000 requests for static content using Apache, and 10,000 requests for the same content using Webby. Apache took 42.5 seconds to serve a static HTML file over the LAN, which amounts to around 235 individual requests per second. Webby did it in about 3.28 seconds.

Apache has an distinct advantage because PHP can be built as a module for it, basically loading PHP into Apache. Given that this is just something I’m doing for fun, I don’t really want to spend the time doing that at this point. So my only options at this point are to drop to the OS and execute PHP, or execute it as a CGI. I tried the former, and it was painfully slow. So that basically meant going the CGI route.

The best way to do CGI is using Fast CGI, which is something I’ve never done before. Basically you open a local socket to a server running PHP, and send it requests to process scripts. I ended up writing a mini Fast CGI library, since I couldn’t find one that was usable. It took me a few hours, but it seems to be working now with PHP.

I still have a few major optimizations to do before I can test out the PHP part properly (you have to initialize PHP with various variables before WordPress will work properly), but maybe next weekend I can set aside an afternoon and finish it off.

Apache’s major detriment is that it’s such a memory hog. Without a doubt, one of the major advantages to Webby in a low memory environment is that it won’t start trashing into swap space as soon as Apache will. I also will be able to adjust the stack size per thread (something that I haven’t done yet), so conceivably I could get the memory foot print even lower (although right now Webby is using around 1.5 MB of memory, whereas Apache is using around 120 MB I believe).

5 responses to “Playing Around With Web Servers”

  1. Trevor says:

    Hey Duane,

    I’m rather blown away by your results – especially serving 10,000 requests in 3 seconds.

    I think – before you dive too much further into writing your own server – you should spend time looking at all the configuration options for Apache. Yes it’s a memory hog but there are plenty of ways to manage that. Don’t forget that apache is probably doing a hell of a lot more than you think it’s doing. Look at caching, not having enough max_clients, etc. max requests per child will allow you to recycle processes after a certain number of requests.

    If you’re just serving static files I’d look at Nginx.

    You’re a super genius but 100’s of other people have already written web servers – you have bigger fish to fry :p

  2. Duane Storey says:

    Yah, I was just bored. I’ve played around with all those options on Apache, and they help some. I suspect lighttp woudl work a lot better, but making it work with WordPress is kind of a pain I’m told. It’s nothing I’m taking too seriously, I’ll probably toss it on a SVN repository at some point and let other people play around with it.

  3. Duane Storey says:

    Also, the main hit is typically with the PHP files. You can’t really cache them effectively. You can use opcode caches which help quite a bit, and WordPress has WP-Cache, but that is basically disabled for anyone that has logged in or has commented, so it doesn’t help everyone. You can also cache static files using Expires, but some browsers (*cough* Safari), don’t really do that properly. In all cases, you end up putting load on the webserver. Given that everyone pretty much still uses Apache even though it’s totally slow, I still think there’s room for improvement here 🙂

  4. Matt says:

    FYI, you should check out lighttpd and thttpd, among others. That would be my current rec for running a webserver in a low memory environment.

    Also, be sure to check to make sure you’re running a fair test. for example, Apache will check for .htaccess files if not configured NOT to – that will certainly slow things down.

  5. Duane Storey says:

    Good point about the .htaccess. Although, you basically have to have it on for WordPress, so not sure the best way to come up with a fair test. But I’ll try to take it into account.

Leave a Reply

Your email address will not be published. Required fields are marked *