Clustering: Using Tomcat and Apache HTTP Server

Clustering is combining a group of machines/servers into a logical entity that can be referenced as if it were one machine. A cluster is defined as a group of application servers that transparently run a J2EE application as if it were a single entity.

  • Horizontal clustering

    Horizontal clustering, sometimes referred to as scaling out, is adding physical machines to increase the performance or capacity of a cluster pool. Typically, horizontal scaling increases the availability of the clustered application at the cost of increased maintenance. Horizontal clustering can add capacity and increased throughput to a clustered application; use this type of clustering in most instances.

  • Vertical clustering

    Vertical clustering, sometimes referred to as scaling up, is adding application server instances (Tomcat instances, in this post) to the same machine. Vertical scaling is useful for taking advantage of unused resources in large SMP servers. You can use vertical clustering to create multiple JVM processes that, together, can use all of the available processing power.

  • Hybrid horizontal and vertical clustering

Here we are going to cluster two Tomcat servers behind an Apache HTTP Server.

Setting Up The Nodes

In most situations you would deploy the nodes on physically separate machines. In this example we will set them up on a single machine, using different ports.

  1. Download Tomcat from the Apache Tomcat website. I am using version 7.0.47.
  2. Let's use a common folder to hold all our app servers and the HTTP server. I will be putting everything in C:\Learning\cluster
  3. Unzip Tomcat twice, into these folders: C:\Learning\cluster\tomcatNode1 and C:\Learning\cluster\tomcatNode2
  4. Right now both servers have identical configuration files (the same HTTP port 8080, and so on), so if we try to start both servers the second won't start.
    Open the server.xml of tomcatNode1 (in C:\Learning\cluster\tomcatNode1\apache-tomcat-7.0.47\conf) and change the ‘Engine’ element: add the “jvmRoute” attribute, which configures the name of this node in the cluster. The “jvmRoute” must be unique across all your nodes (this post uses node1 and node2). Keep everything else the same.
    [Screenshot: server.xml of node 1 with jvmRoute added to the Engine element]
  5. Now change the other server.xml (in C:\Learning\cluster\tomcatNode2\apache-tomcat-7.0.47\conf). Here, apart from the change in the ‘Engine’ element, we also need to change the ports of the ‘Server’ element, the ‘Connector’ used by HTTP and the ‘Connector’ used by AJP. The balancer configuration later in this post expects node 1's AJP connector on port 8009 (the Tomcat default) and node 2's on port 8019.
    [Screenshots: server.xml of node 2 with the changed Server port, HTTP Connector port and AJP Connector port]
  6. We're done with Tomcat. Start each node and make sure that both can run at the same time.
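The server.xml changes in steps 4 and 5 can also be sketched as a small script. This is only an illustration under stated assumptions: the XML below is a cut-down stand-in for a real server.xml, the helper name configure_node is hypothetical, and the offset of 10 is chosen only because it turns the default AJP port 8009 into the 8019 used later in this post (the HTTP and shutdown ports it produces, 8090 and 8015, are assumptions). ElementTree also drops XML comments, so editing the real file by hand is usually the safer route.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for a stock Tomcat 7 server.xml (a real file has more in it).
SAMPLE_SERVER_XML = """\
<Server port="8005" shutdown="SHUTDOWN">
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1" redirectPort="8443"/>
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"/>
    <Engine name="Catalina" defaultHost="localhost"/>
  </Service>
</Server>
"""


def configure_node(server_xml, jvm_route, port_offset=0):
    """Return server.xml text with jvmRoute set and listening ports shifted."""
    root = ET.fromstring(server_xml)  # the <Server> element

    # Shutdown port on <Server>.
    root.set("port", str(int(root.get("port")) + port_offset))

    # HTTP and AJP ports on each <Connector> (redirectPort is left alone).
    for connector in root.iter("Connector"):
        connector.set("port", str(int(connector.get("port")) + port_offset))

    # Node name used by the load balancer for sticky sessions.
    root.find("./Service/Engine").set("jvmRoute", jvm_route)

    return ET.tostring(root, encoding="unicode")


# Node 1 keeps the default ports; node 2 is shifted so both can run together.
node1_xml = configure_node(SAMPLE_SERVER_XML, "node1")
node2_xml = configure_node(SAMPLE_SERVER_XML, "node2", port_offset=10)
print(node2_xml)
```

The point of the sketch is that only two kinds of values differ between the nodes: the jvmRoute name and the ports each node listens on.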

Let's talk about AJP (Apache JServ Protocol). In a web scenario, client-to-server traffic is usually carried over HTTP (HyperText Transfer Protocol). That applies both to traffic from the browser to the public-facing server and, in many applications, to onward transfers from the public-facing server to other servers that provide content or run business logic.

An HTTP request comprises a series of lines of data, each terminated by a line break. The first of these lines comprises the request method (such as GET or POST), followed by the name of the resource required (such as /index.html), followed by a protocol version (such as HTTP/1.1). Subsequent lines carry headers such as the name of the host being contacted, the referrer, cookies, the type of browser, the preferred language, and many more details. A server processes an HTTP request and sends out a response. The response comprises a header block, a blank line, and (in most cases) a data block. The first line of the header block includes a response code which indicates the success or otherwise of the request: a 3-digit number in the following ranges:

  • 200 and up – success; expect good data to follow
  • 300 and up – redirection; the request was fine but the client must look elsewhere, usually with headers only (no data), e.g. the page has moved
  • 400 and up – error in the request, e.g. the request was for a missing page (404)
  • 500 and up – error on the server while handling the request, e.g. a program on the server failed

This first line of the header block is followed by other headers telling the receiving system the content type (MIME type), which lets that system know whether to handle the data as HTML, a JPEG image, etc. Then there's a blank line and the actual data.
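The response layout described above (status line, headers, blank line, data) can be made concrete with a few lines of Python. This is a minimal sketch rather than a real HTTP client: the function names parse_response and status_class and the sample response are made up for illustration, and details such as chunked bodies or repeated headers are ignored.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, code, headers, body)."""
    # The blank line (CRLF CRLF) separates the header block from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # First line: protocol version, 3-digit response code, optional reason text.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])

    # Remaining lines: "Name: value" headers.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, headers, body


def status_class(code):
    """Map a response code to the range it falls in."""
    classes = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return classes.get(code // 100, "other")


# Hypothetical response for a missing page.
SAMPLE = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"Not here!"
)

version, code, headers, body = parse_response(SAMPLE)
print(code, status_class(code), headers["content-type"], body)
```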

The HTTPS protocol carries the same information as HTTP, but adds a Secure Sockets Layer (SSL, nowadays TLS). In other words, the data is encrypted at the client and decrypted at the server, and then the same happens in reverse for the response. The purpose of this encryption is to ensure that stray data packets viewed along the way are of no use to the person who has them: they are uninterpretable binary data.

HTTP is quite expensive in terms of bandwidth: it is an ASCII text protocol, with words like “POST” and phrases like “Content-type:” taking up more bytes than is really needed and having to be parsed at the destination too. So AJP (the Apache JServ Protocol introduced above) was established to allow for less expensive exchanges between upstream and downstream servers that are closely linked. AJP carries the same information as HTTP but in a binary format. The request method (GET or POST) is reduced to a single byte, and the names of the common headers are each reduced to 2 bytes, so the packet is smaller than the equivalent HTTP text. How much smaller depends on how much of the request consists of header values, which are still sent as strings.

Between servers, HTTP works well in most cases, but if you have busy servers with bandwidth constraints between them, consider AJP as the linking protocol. AJP is a binary protocol that can proxy inbound requests from a web server through to an application server that sits behind the web server.
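To give a feel for the binary encoding described above, here is a rough sketch of how a request method and a header might be encoded. It covers only these two pieces, not a complete AJP packet. The code tables are recalled from the AJP13 documentation and should be checked against it before relying on them, and the helper names (ajp_string, encode_method, encode_header) are made up for this example.

```python
import struct

# Single-byte method codes (a subset; assumed from the AJP13 documentation).
METHOD_CODES = {"OPTIONS": 1, "GET": 2, "HEAD": 3, "POST": 4, "PUT": 5, "DELETE": 6}

# Two-byte codes for common request header names (a subset, same caveat).
HEADER_CODES = {
    "accept": 0xA001,
    "content-type": 0xA007,
    "content-length": 0xA008,
    "cookie": 0xA009,
    "host": 0xA00B,
    "user-agent": 0xA00E,
}


def ajp_string(text):
    """AJP-style string: 2-byte big-endian length, the bytes, then a NUL."""
    data = text.encode("iso-8859-1")
    return struct.pack(">H", len(data)) + data + b"\x00"


def encode_method(method):
    """The request method becomes a single byte."""
    return bytes([METHOD_CODES[method.upper()]])


def encode_header(name, value):
    """Known header names become a 2-byte code; values stay strings."""
    code = HEADER_CODES.get(name.lower())
    name_bytes = struct.pack(">H", code) if code is not None else ajp_string(name)
    return name_bytes + ajp_string(value)


http_text = b"Host: localhost\r\n"
ajp_bytes = encode_header("Host", "localhost")
print(len(http_text), "bytes as HTTP text,", len(ajp_bytes), "bytes in this encoding")
```

As the byte counts suggest, the saving comes from the method and the header names. Header values travel as strings either way.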

Installing Apache Http Server

  1. Download the Apache HTTP Server binaries (I used httpd-2.2.25-win32-x86-openssl-0.9.8y.msi).
    [Screenshot: Apache HTTP Server download]
  2. Start the installation with the following attributes.
    [Screenshot: installer settings]
  3. Use the custom install and change the default path (not necessary, just to keep everything in one place; this post uses C:\Learning\cluster\apache2.2).
    [Screenshots: custom install and installation path]

Which connector to use?

As we know, AJP is an optimized, binary version of the HTTP protocol that allows a standalone web server such as Apache to talk to Tomcat. The idea is to let Apache serve the static content when possible, but proxy the request to Tomcat for content that Tomcat must generate. There are three main ways to do it:

  • mod_jk is mature, stable and extremely flexible. It is under active development by members of the Tomcat community.
  • mod_proxy_ajp is distributed with Apache httpd 2.2 and later. Note that the communication protocol used is AJP.
  • mod_proxy_http is a cheap way to proxy without the hassles of configuring JK. If you don’t need some of the features of mod_jk, this is a very simple alternative. Note that the communication protocol used is HTTP.

We will be using mod_proxy_ajp.

Setting Up The Apache Cluster

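  1. Open httpd.conf (in C:\Learning\cluster\apache2.2\conf) and uncomment the following lines. This enables the necessary mod_proxy modules in Apache (for this setup: mod_proxy, mod_proxy_ajp and mod_proxy_balancer).
    [Screenshot: LoadModule lines for the proxy modules uncommented in httpd.conf]
  2. Add the following at the end of the file: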
    <Proxy balancer://testcluster stickysession=JSESSIONID>
    BalancerMember ajp://127.0.0.1:8009 min=10 max=100 route=node1 loadfactor=1
    BalancerMember ajp://127.0.0.1:8019 min=20 max=200 route=node2 loadfactor=1
    </Proxy>
    
    ProxyPass /examples balancer://testcluster/examples
    

The above is the actual clustering configuration. The first section configures a load balancer across our two nodes. The loadfactor can be modified to send more traffic to one node or the other. The “route” setting here must match the “jvmRoute” name in the Tomcat server.xml of each node.
This, in conjunction with the “stickysession” setting, is key for a Tomcat cluster, as it configures the session management. Tomcat appends its jvmRoute to the session ID (for example “.node1”), and “stickysession” tells mod_proxy to look for that route in the given session cookie to determine which node the session is using. This allows all requests from a given client to go to the node which is holding the session state for that client.
The ProxyPass line maps a URL on Apache to the load-balanced cluster. You may want this to be “/”, e.g. “ProxyPass / balancer://testcluster/”. In our case we're just configuring the Tomcat /examples application for our test.
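The routing decision described above can be sketched in a few lines of Python. This is only a model of the idea, not how mod_proxy_balancer is implemented: the function names are made up, and a weighted random pick stands in for Apache's own scheduling when there is no usable route in the cookie.

```python
import random

# Mirrors the two BalancerMember lines in the configuration above.
MEMBERS = {
    "node1": {"url": "ajp://127.0.0.1:8009", "loadfactor": 1},
    "node2": {"url": "ajp://127.0.0.1:8019", "loadfactor": 1},
}


def route_from_session(session_id):
    """Return the route suffix of a session ID such as 'ABC123.node1', if any."""
    if session_id and "." in session_id:
        return session_id.rsplit(".", 1)[1]
    return None


def choose_member(session_id, members=MEMBERS, rng=random):
    """Pick the node for a request, honouring the sticky route when possible."""
    route = route_from_session(session_id)
    if route in members:
        return route  # sticky: same node that created the session

    # No session yet (or an unknown route): weight the choice by loadfactor.
    # Assumption: weighted random is a stand-in for the real balancing method.
    names = list(members)
    weights = [members[name]["loadfactor"] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


print(choose_member("ABC123.node2"))  # always node2, because of the route suffix
print(choose_member(None))            # either node, no session to stick to
```

This also shows why the “route” values must match the jvmRoute names: if the suffix in the cookie is not a known route, the request cannot be pinned to the node that holds the session.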

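Let's Test

First we need to check whether Apache HTTP Server is running. Open http://localhost
[Screenshot: http://localhost in the browser]

You can start and stop Apache HTTP Server using its UI.
[Screenshots: starting and stopping Apache HTTP Server]

Now open http://localhost/examples.

You might get an error page at this point, for example if neither Tomcat node is currently running:

[Screenshot: http://localhost/examples before a node is started]

Start node 1 and check again.
[Screenshot: http://localhost/examples with node 1 running]

Now stop node 1, start node 2, and check again. The result should be the same.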

Balancer Manager Setup

mod_proxy has an additional “balancer manager” component which provides a web interface to the load balanced cluster.

  1. Add the following lines to your httpd.conf (in C:\Learning\cluster\apache2.2\conf):
    <Location /balancer-manager>
    SetHandler balancer-manager
    AuthType Basic
    AuthName "Balancer Manager"
    AuthUserFile "C:/Learning/cluster/apache2.2/conf/.htpasswd"
    Require valid-user
    </Location>
    
  2. Now we need to create a password file to secure it. Open a command prompt and navigate to C:\Learning\cluster\apache2.2\bin
    Now run the following command:
    htpasswd.exe -c C:\Learning\cluster\apache2.2\conf\.htpasswd admin

    [Screenshot: htpasswd asking for the new password]
    Enter the password when prompted.

  3. Now open http://localhost/balancer-manager
    A login prompt will appear. Enter ‘admin’ as the username and the password that you provided previously.
    [Screenshot: Balancer Manager web interface]

 
