Load Balancing Apache and Plone

This How-to applies to: All
This How-to is intended for: Advanced Server Administrator

There are many solutions to this issue; this might be one of the neater ones.

Purpose

With Plone, and indeed other web application engines, it is possible to run a number of front-end servers in order to spread the load and effectively increase a web application's overall capacity. This implicitly provides a degree of fail-over should you need to take down one of the front-end servers or if one should fail.

Q. Why is this difficult? After all, load balancing is a standard feature in Apache 2.

A. Sure. However, session multiplexing is not a standard feature, and using raw load balancing will break a Plone setup as soon as you try to log in.

The issue most people have is lack of access to the underlying Plone instances on the individual front-end servers. Indeed, on ISP-based setups the individual instances are owned and run by third parties, so playing with login procedures in order to set session variables for Apache to use in load balancing is not really an option.

Prerequisites

You will need a number of Plone application servers (although this should work with *any* application server) and a recent copy of Apache 2.x (I'm using 2.2.4; earlier versions should also work). Your Apache server will need to be suitably configured to allow proxy pass-through and URL rewriting; we won't cover this here.
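
For orientation only, the modules this setup relies on would typically be loaded along the following lines. This is a sketch, not the author's configuration: the module file paths are assumptions and vary by distribution, and on Debian/Ubuntu the same modules are normally enabled with the a2enmod tool rather than by hand.

```apache
# Assumed module paths, relative to ServerRoot; adjust for your distribution.
LoadModule rewrite_module        modules/mod_rewrite.so          # RewriteRule, [E=], [CO=], [P]
LoadModule proxy_module          modules/mod_proxy.so            # core proxy support
LoadModule proxy_http_module     modules/mod_proxy_http.so       # http:// back ends
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so   # balancer:// and stickysession
```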

Note that this solution 'ties' a session to the first front-end server that services it and only reverts to a different server should the servicing server fail. In this instance (typically) the user would find themselves unexpectedly logged out, unless you are sharing the SESSION information via ZEO, something that's not always as reliable as one might like.

Step by step

First, as a little background: we use this setup to run over 100 shared Plone instances over six back-end Zope servers on one ZEO server, with many hundreds of thousands of visits per day (so we think it works).

We run Apache on a stand-alone Ubuntu Server (well, it's actually a Xen instance) running the "Gutsy" release of Ubuntu.

Pretty much all of our configuration goes into /etc/apache2/httpd.conf, as follows:

Defining the nodes

The beauty of this solution is that you don't need to touch the Zope servers or the login processes in order to make Apache handle proper session stickiness. Let's define two nodes, one per Zope server, as follows:

<VirtualHost *:80>
        ServerName       node1
        RewriteEngine    On
        RewriteRule      . -    [E=MYHOST:zope1]
        RewriteRule      . -    [E=MYPORT:8080]
        RewriteRule      . -    [CO=BALANCEID:balancer.%{ENV:MYHOST}:.%{HTTP:X-Forwarded-Server}:1200]
        RewriteRule     /(.*)$  http://%{ENV:MYHOST}:%{ENV:MYPORT}/$1 [L,P]
</VirtualHost>

<VirtualHost *:80>
        ServerName       node2
        RewriteEngine    On
        RewriteRule      . -    [E=MYHOST:zope2]
        RewriteRule      . -    [E=MYPORT:8080]
        RewriteRule      . -    [CO=BALANCEID:balancer.%{ENV:MYHOST}:.%{HTTP:X-Forwarded-Server}:1200]
        RewriteRule     /(.*)$  http://%{ENV:MYHOST}:%{ENV:MYPORT}/$1 [L,P]
</VirtualHost>

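Each node forwards any request to its respective Zope server; however, prior to forwarding the request, it sets a cookie that Apache can reference later to direct subsequent requests from the same session to the same server. Note that the names node1 and node2 must resolve to the Apache server itself (for example via /etc/hosts), since the balancer reaches these nodes as ordinary virtual hosts.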

Explanation

[E=MYHOST:zope1]   Sets an environment variable called MYHOST to the value "zope1".
[E=MYPORT:8080]    Sets MYPORT to "8080".
[CO=...]           Sets a cookie called BALANCEID to "balancer." followed by the value of MYHOST
                   (e.g. "balancer.zope1") within the cookie domain of the requested server, i.e.
                   what's in the X-Forwarded-Server HTTP header, with a lifetime of 1200. Note that
                   mod_rewrite reads the cookie lifetime in minutes, not seconds, so 1200 means 20 hours.

All in all it's fairly obvious once you get your head around how cookies are handled (i.e. per domain), how Apache sets up a cookie and which parameters are needed. Strictly speaking we don't need the variables MYHOST and MYPORT, but in real life we relegate the last two lines to an "include" file and use it to shorten the spec for all six Zope instances, and it's also handy documentation.
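
The "include" arrangement mentioned above might look roughly like this. It is a sketch under stated assumptions: the file name and path /etc/apache2/zope-node.inc are hypothetical, and the two shared rules are copied unchanged from the node definitions.

```apache
# --- /etc/apache2/zope-node.inc (hypothetical file name and path) ---
# Shared by every node: set the sticky cookie, then proxy to the Zope server
# named by MYHOST/MYPORT, which the including virtual host must set first.
RewriteRule      . -    [CO=BALANCEID:balancer.%{ENV:MYHOST}:.%{HTTP:X-Forwarded-Server}:1200]
RewriteRule     /(.*)$  http://%{ENV:MYHOST}:%{ENV:MYPORT}/$1 [L,P]

# --- httpd.conf: each node now only states what differs ---
<VirtualHost *:80>
        ServerName       node1
        RewriteEngine    On
        RewriteRule      . -    [E=MYHOST:zope1]
        RewriteRule      . -    [E=MYPORT:8080]
        Include          /etc/apache2/zope-node.inc
</VirtualHost>
```

With this layout, adding another node means copying the short virtual host and changing only ServerName and MYHOST.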

Defining a Balancer

Apache's design for implementing balancers has a lot going for it; however, ease of definition probably isn't right at the top of the list. Here's an example balancer which references the two nodes above.

<Proxy balancer://bronze>
        BalancerMember http://node1  lbset=1 min=1 max=6 smax=16 loadfactor=1 timeout=10 retry=60 route=zope1
        BalancerMember http://node2  lbset=1 min=1 max=6 smax=16 loadfactor=1 timeout=10 retry=60 route=zope2
        BalancerMember http://backup lbset=2 min=1 max=4 smax=8  loadfactor=1 timeout=10 retry=60 status=+H
        ProxySet lbmethod=byrequests stickysession=BALANCEID nofailover=Off maxattempts=2 timeout=20
        ErrorDocument 503 /ERROR_503.html
        Allow from all
</Proxy>

Note that we're referencing a "backup" node that we've not defined; it would be defined in exactly the same way as node1 and node2 above. The difference is that it has lbset=2, which means it's only tried after all the members with lbset=1 have failed, and status=+H, which marks it as a hot standby. It also has no route defined, so the balancer won't stick the session to the backup node if both zope1 and zope2 fail (i.e. requests fall back onto zope1/2 as soon as they become available again).
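
As a sketch of what that undefined "backup" node could look like, here is a virtual host following the same pattern as node1 and node2. The back-end host name "zope-backup" and its port are assumptions made for illustration; the source does not say which server backs this node.

```apache
<VirtualHost *:80>
        ServerName       backup
        RewriteEngine    On
        # "zope-backup" is a hypothetical host name for the standby Zope server
        RewriteRule      . -    [E=MYHOST:zope-backup]
        RewriteRule      . -    [E=MYPORT:8080]
        # The cookie is still set, but no BalancerMember carries route=zope-backup,
        # so the balancer cannot pin later requests to this node.
        RewriteRule      . -    [CO=BALANCEID:balancer.%{ENV:MYHOST}:.%{HTTP:X-Forwarded-Server}:1200]
        RewriteRule     /(.*)$  http://%{ENV:MYHOST}:%{ENV:MYPORT}/$1 [L,P]
</VirtualHost>
```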

Note also that the cookie is set to "balancer.zope<n>", while the route specified in the balancer section is simply "zope<n>": Apache takes the part of the cookie value after the dot as the route.

Defining a balanced Virtual server

Ok, so now we have a balancer with a couple of sticky Zope servers, what do we do with them?

Here's a sample virtual host configuration that will typically live in /etc/apache2/sites-enabled (under Ubuntu).

<VirtualHost *:80>
   ServerName    linux.co.uk
   DocumentRoot  /tmp
   RewriteEngine On
   RewriteRule   ^/(.*) balancer://bronze/VirtualHostBase/http/%{HTTP_HOST}:80/plone/linux/VirtualHostRoot/$1 [L,P]
</VirtualHost>

Note that with Plone we make use of the inbuilt "VirtualHostMonster" to handle URL remapping within Zope itself. So effectively each request is making three passes at Apache.

The initial pass enters the virtual host definition and gets redirected to the balancer.

The second pass enters the balancer and one of two things happens: either it picks a route itself (here using lbmethod=byrequests) if there is no session cookie for that particular site, or it uses the pre-existing cookie to determine which Zope server to forward the request to.

The third pass activates either zope1 or zope2 and actually forwards the request to the appropriate Zope server, so Apache does all the session tracking work for you without any modifications to Zope, Plone or anything else.
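
One way to check that the passes behave as described is to log the balancer's routing decision for each request. The following is a minimal sketch meant to sit inside the linux.co.uk virtual host. It assumes your Apache version exports the BALANCER_* environment variables from mod_proxy_balancer; the log file path and the format nickname "balancer" are hypothetical.

```apache
# Log which route served each request and whether the cookie was honoured.
# BALANCER_WORKER_ROUTE  - route of the member that handled the request (zope1/zope2)
# BALANCER_SESSION_STICKY - name of the sticky cookie in use (BALANCEID)
# BALANCER_ROUTE_CHANGED  - set to 1 when the session had to move to another route
LogFormat "%h %t \"%r\" %>s route=%{BALANCER_WORKER_ROUTE}e sticky=%{BALANCER_SESSION_STICKY}e changed=%{BALANCER_ROUTE_CHANGED}e" balancer
CustomLog /var/log/apache2/balancer.log balancer
```

A session that is sticking correctly should show the same route on every request after the first.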

Further information

There's lots of cryptic documentation out on the web about how to do things. I think the main "Plone" examples involve modifying the login process and setting cookies the first time you log in. This is fine for a single application, but a non-starter when you're talking about mass Plone hosting.
