Tomcat Architecture

Tomcat is a servlet container made up of pluggable components that fit together in a nested manner. Tomcat is highly configurable: you can add specialized filters, change port numbers and IP address bindings, adjust security settings, and so on. You should always change the default settings before running in a production environment, especially the security-related ones; I will touch on this in the Tomcat security section. At first the Tomcat configuration files appear cryptic and difficult to understand, but in future articles I will try to break the complexity of Tomcat down into an easily understandable format.

Tomcat Directory Overview

I have already touched on this in Tomcat Installation; I will now dive deeper into which files are used within each directory.

Each entry below gives the directory, the files it holds and a description.

bin

Files: bootstrap.jar, commons-daemon.jar, tomcat-juli.jar, startup.bat, catalina.sh

This directory holds some of the JAR files that are required when starting Tomcat, along with the startup scripts themselves. startup.bat is used to start Tomcat as a background process, while catalina.sh can be run from the command line and accepts additional parameters that change how Tomcat starts.

conf

  • catalina.policy - contains security policy statements that are enforced by the Java SecurityManager. It takes the place of the java.policy file that comes with the JVM, and it prevents rogue code in servlets or JSPs from executing damaging code that can affect the container. It is only read once, when Tomcat is launched, so you need to restart Tomcat if you change this file.
  • catalina.properties - contains a list of Java packages that cannot be overridden by executable Java code in servlets or JSPs, which could otherwise be a security risk.
  • context.xml - the default Context settings, used by all Web applications; it tells Tomcat where each application's web.xml should be accessed.
  • logging.properties - details the logging within Tomcat. Two default handlers are set up, a ConsoleHandler and a FileHandler, and you can change the logging level using this file.
  • server.xml - the main configuration file in Tomcat; it is used by the "digester" to build the container on startup.
  • tomcat-users.xml - used for security, to allow access to the administration applications. It is used with the default UserDatabase Realm as referenced in server.xml.
  • web.xml - the default web.xml file that is used by all Web applications. It sets up the JspServlet to allow your applications to handle JSPs and a default servlet to handle static resources and HTML files. It also sets up default session timeouts, welcome files and MIME types.

lib

Files: a number of JAR files

All the JAR files that the container uses are located here, including the Tomcat JARs and the servlet and JSP application programming interfaces (APIs). Place your own JAR files here if they will be used across all your Web applications.

logs

Files: a number of log files

Contains a number of log files, which are produced by JULI logging (discussed in a later topic). The logs are rotated each day, so you may need to clear them down from time to time.

temp

Used for scratch files and other temporary use.

webapps

Files: Web application files

This is where the Web application files reside, including your own Web applications. This is where you place your Web Application aRchive (WAR) file; Tomcat will then deploy it. We will get into deploying Web applications in another topic.

There are several default Web applications that come with Tomcat:

  • ROOT - the welcome screen that you saw when you first installed Tomcat. It is mapped to the special context path "/" and is normally removed when you move into production. From this application you can access all the Web applications below.
  • docs - contains the Tomcat documentation
  • examples - contains some JSP and servlet examples
  • host-manager - allows you to manage the virtual hosts that run in your Tomcat instance; use the /host-manager/html URL to access it
  • manager - allows you to manage your applications in Tomcat; you can start, stop, reload, deploy and undeploy your applications. Use the /manager/html/ URL to access it

work

Used for temporary working files. It is used heavily during JSP compilation, where the JSPs are converted to Java servlets and accessed through this directory.

Tomcat Architecture Overview

Tomcat 6 consists of a nested hierarchy of components; containers are components that can contain a collection of other components. The diagram below displays how the Tomcat architecture looks. Components that can occur multiple times, including Connector, Logger, Valve, Host and Context, are denoted by a symbol that has multiple profiles.

Server

The server is Tomcat itself, an instance of the Web application server. It owns a port that is used to shut down the server (port 8005 by default). You can set up multiple servers on one node provided they use different ports. The server is an implementation of the Server interface; the standard implementation is the StandardServer class.

Service

A service groups a container (usually an engine) with a set of Connectors. The service is responsible for accepting requests, routing them to the specified Web application and specific resources, and then returning the result of processing the request. Services are the middle man between the client's Web browser and the container.

Connectors

Connectors connect the applications to clients. By default they receive incoming requests from clients over HTTP (port 8080) or AJP (port 8009).

The default connector is Coyote, which implements HTTP/1.1.

Engine

The engine is the top-level container; it cannot be contained by another container, so it is the parent of all the containers beneath it. The engine is a request-processing component that represents the Catalina Server Engine.

It examines the HTTP headers to determine the virtual host or context to which requests should be passed. An engine contains Hosts, each representing a virtual host with a group of Web applications, and each Host contains Contexts, each representing a single Web application.
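
To make the routing step a little more concrete, here is a minimal sketch in Java of how a Host header might be matched to a virtual host. It is only a toy model and not Tomcat's real mapping code: the class HostRoutingSketch, its methods and the example host names are all assumptions made for illustration. The fallback to a default host mirrors the defaultHost attribute of the Engine element in server.xml.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Toy model of an engine choosing a virtual host from the HTTP Host header.
// Hypothetical names throughout; this is not Tomcat's real mapper.
class HostRoutingSketch {
    private final Map<String, String> hosts = new LinkedHashMap<>();
    private final String defaultHost;

    HostRoutingSketch(String defaultHost) {
        this.defaultHost = defaultHost;
    }

    // Register a virtual host name with the directory that holds its applications.
    void addHost(String name, String appBase) {
        hosts.put(name.toLowerCase(Locale.ROOT), appBase);
    }

    // Strip an optional ":port" suffix, lower-case the name,
    // and fall back to the default host when nothing matches.
    String resolve(String hostHeader) {
        String name = hostHeader == null ? "" : hostHeader.trim().toLowerCase(Locale.ROOT);
        int colon = name.indexOf(':');
        if (colon >= 0) {
            name = name.substring(0, colon);
        }
        return hosts.containsKey(name) ? name : defaultHost;
    }

    // Look up the application directory of the host that would serve the request.
    String appBaseFor(String hostHeader) {
        return hosts.get(resolve(hostHeader));
    }

    public static void main(String[] args) {
        HostRoutingSketch engine = new HostRoutingSketch("localhost");
        engine.addHost("localhost", "webapps");
        engine.addHost("www.example.com", "example-apps");
        System.out.println(engine.resolve("www.example.com:8080"));
        System.out.println(engine.resolve("unknown.example.org"));
    }
}
```

The sketch ignores host aliases and IPv6 literals, which a real container would also have to handle.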

Realm

The realm for an engine manages user authentication and authorization. Resources use roles to allow access, and the realm enforces the security policies. A realm applies across the whole engine; however, it can be overridden by a realm at the Host level or the Context level, so it is an object that can be superseded by its child objects.

Valves

Valves are used to intercept a request and preprocess it. They are similar to the filter mechanism of the servlet specification but are specific to Tomcat.

They are used for single sign-on across all Hosts on a Server, and for logging request patterns, client IP addresses and server usage patterns.

You can have multiple Valves on a particular parent; they are typically chained in the order in which they were added to the parent. This means that you may have Valves that depend on other Valves.
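
The chaining behavior can be sketched in a few lines of Java. This is a simplified model under my own assumptions, not Tomcat's org.apache.catalina.Valve API: the SimpleValve interface, the ValveChainSketch class and the "access-log" and "sso" valves are hypothetical. It shows valves running in the order they were added, with a later stage seeing what an earlier valve did to the request.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a valve chain. The interface and class names are hypothetical,
// not Tomcat's real Valve API.
class ValveChainSketch {
    interface SimpleValve {
        // Process the request, then hand over to the rest of the chain via next.run().
        void invoke(StringBuilder request, Runnable next);
    }

    private final List<SimpleValve> valves = new ArrayList<>();

    void addValve(SimpleValve valve) {
        valves.add(valve);
    }

    // Run the valves in insertion order, finishing with the target component.
    void process(StringBuilder request, Runnable target) {
        invokeFrom(0, request, target);
    }

    private void invokeFrom(int index, StringBuilder request, Runnable target) {
        if (index == valves.size()) {
            target.run();
            return;
        }
        valves.get(index).invoke(request, () -> invokeFrom(index + 1, request, target));
    }

    // Build a two-valve chain and record the order in which each stage runs.
    static String demo() {
        ValveChainSketch chain = new ValveChainSketch();
        List<String> log = new ArrayList<>();
        chain.addValve((req, next) -> { log.add("access-log"); next.run(); });
        chain.addValve((req, next) -> { req.append(";user=alice"); log.add("sso"); next.run(); });
        StringBuilder request = new StringBuilder("GET /index.jsp");
        chain.process(request, () -> log.add("servlet saw: " + request));
        return String.join(" | ", log);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because each valve decides whether to call next.run(), a valve can also stop a request from going any further, which is how an access-control valve would behave in this model.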

Loggers

Loggers report on the internal state of a component. Logging behavior is inherited, so the logger from the engine is assigned to every child object unless overridden.

Host

Hosts mimic the popular Apache virtual host concept, in which several sites on one machine are told apart by host name or IP address; in Tomcat each Host is identified by its name. You can have multiple Hosts, each with its own Web applications.

Context

The Context is the Web application itself. You can enable dynamic reloading so that any classes that have been changed are reloaded into memory. The Context can also have specific error pages for each Web application, and you can set up initialization parameters and access control.

The context implements the Context interface; most Contexts are created with the StandardContext class.

Because the Context itself is a container at the Web application level, it becomes the parent of servlets and filters; the Context adds these children using the StandardWrapper class.

Connector Architecture

All connectors work on the same principle: they have an Apache module end (mod_jk or mod_proxy) that loads just like any other Apache module.

On the Tomcat end, each Tomcat instance has a connector module component written in Java. In Tomcat 6 this is the org.apache.catalina.connector.Connector class. Its constructor takes one of two connector types, HTTP/1.1 or AJP/1.3. You call the constructor indirectly via the server.xml file, using the Connector element and its protocol attribute. Depending on your setup, different classes will be used:

When the Apache Portable Runtime (APR) is supported:
  • HTTP/1.1: org.apache.coyote.http11.Http11AprProtocol
  • AJP/1.3: org.apache.coyote.ajp.AjpAprProtocol

Note: see APR installation for more information on APR

When APR is not supported:
  • HTTP/1.1: org.apache.coyote.http11.Http11Protocol
  • AJP/1.3: org.apache.jk.server.JkCoyoteHandler

The Web server handles all the static content, but when it comes across content intended for a servlet container, it passes it to the module in question (mod_jk or mod_proxy). The Web server knows what content to pass to the connector module because the directives in the Web server's configuration specify this.

The Apache JServ Protocol (AJP) uses a binary format for transmitting data between the Web server and Tomcat; a network socket is used for all communication. An AJP packet consists of a packet header and a payload, structured as follows.

A packet sent from the Web server to Tomcat starts with the sequence 0x1234, followed by the packet size (2 bytes) and then the actual payload. On the return path the packets are prefixed by AB (the ASCII codes for A and B), followed by the size of the packet and then the payload.
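
The framing is simple enough to sketch in Java. The code below is a minimal illustration, not Tomcat's connector code: AjpFramingSketch and its methods are hypothetical names, and the payload is treated as opaque bytes. I am assuming the 2-byte size is written high-order byte first, which is the convention AJP13 uses for integers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of AJP framing: 2-byte prefix, 2-byte payload length, then the payload.
// Hypothetical class and method names; this is not Tomcat's connector code.
class AjpFramingSketch {
    // Frame a payload travelling from the Web server to Tomcat (prefix 0x12 0x34).
    static byte[] frameRequest(byte[] payload) {
        return frame((byte) 0x12, (byte) 0x34, payload);
    }

    // Frame a payload travelling from Tomcat back to the Web server (prefix 'A' 'B').
    static byte[] frameResponse(byte[] payload) {
        return frame((byte) 'A', (byte) 'B', payload);
    }

    private static byte[] frame(byte first, byte second, byte[] payload) {
        if (payload.length > 0xFFFF) {
            throw new IllegalArgumentException("payload does not fit a 2-byte length field");
        }
        // ByteBuffer is big-endian by default, so the high-order length byte comes first.
        ByteBuffer buffer = ByteBuffer.allocate(4 + payload.length);
        buffer.put(first).put(second);
        buffer.putShort((short) payload.length);
        buffer.put(payload);
        return buffer.array();
    }

    // Read the length field of a framed packet and return just its payload.
    static byte[] payloadOf(byte[] packet) {
        int length = ByteBuffer.wrap(packet).getShort(2) & 0xFFFF;
        return Arrays.copyOfRange(packet, 4, 4 + length);
    }

    public static void main(String[] args) {
        byte[] packet = frameRequest("ping".getBytes(StandardCharsets.US_ASCII));
        System.out.printf("%02x %02x, payload length %d%n",
                packet[0], packet[1], payloadOf(packet).length);
    }
}
```

A real AJP payload has its own internal structure (message type, headers, attributes), which this sketch deliberately leaves out.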

The HTTP connector is exactly what the name implies: it uses the HTTP protocol to exchange messages. You can also use HTTPS, but this requires an SSL certificate and a few changes to Tomcat's configuration.

Lifecycle

Tomcat starts its components from the top down: when starting, the parent gets started first and then the children, and stopping happens in the reverse order. This is done through the Lifecycle interface, together with LifecycleEvent and LifecycleListener.

The Lifecycle interface has two key methods, start() and stop(). All major components usually contain a LifecycleSupport object that manages all of the LifecycleListener objects for that component; it is this object that propagates and fires general events. The top-level component calls the start() method of each of its children, and the reverse is true when stopping. This approach allows you to stop and start individual Host components without affecting any other Hosts.
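
The ordering described above can be modelled with a small Java sketch. It is a toy under my own assumptions rather than Tomcat's Lifecycle API: SimpleComponent, Listener and the component names are hypothetical. Parents start before their children, children stop before their parents, and one host can be stopped on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of nested components with start/stop ordering and listeners.
// SimpleComponent and Listener are hypothetical, not Tomcat's Lifecycle API.
class LifecycleSketch {
    interface Listener {
        void onEvent(String type, String componentName);
    }

    static class SimpleComponent {
        private final String name;
        private final List<SimpleComponent> children = new ArrayList<>();
        private final List<Listener> listeners = new ArrayList<>();
        private boolean started;

        SimpleComponent(String name) {
            this.name = name;
        }

        SimpleComponent addChild(SimpleComponent child) {
            children.add(child);
            return this;
        }

        void addListener(Listener listener) {
            listeners.add(listener);
        }

        private void fire(String type) {
            for (Listener listener : listeners) {
                listener.onEvent(type, name);
            }
        }

        // Start this component first, then each child in the order it was added.
        void start() {
            if (started) return;
            started = true;
            fire("start");
            for (SimpleComponent child : children) {
                child.start();
            }
        }

        // Stop the children in reverse order, then this component.
        void stop() {
            if (!started) return;
            for (int i = children.size() - 1; i >= 0; i--) {
                children.get(i).stop();
            }
            started = false;
            fire("stop");
        }
    }

    static List<String> demo() {
        List<String> events = new ArrayList<>();
        Listener recorder = (type, component) -> events.add(type + ":" + component);
        SimpleComponent hostA = new SimpleComponent("hostA");
        SimpleComponent hostB = new SimpleComponent("hostB");
        SimpleComponent engine = new SimpleComponent("engine").addChild(hostA).addChild(hostB);
        for (SimpleComponent component : Arrays.asList(engine, hostA, hostB)) {
            component.addListener(recorder);
        }
        engine.start();
        hostB.stop();   // one host is stopped on its own; hostA keeps running
        engine.stop();  // hostB is already stopped, so only hostA and the engine stop
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```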

A LifecycleListener can be added at any level in the Tomcat container to execute specific code when a particular event is fired. By default there are three listeners configured at the server level; listeners are configured in the server.xml or context.xml file at the relevant level.

Configuration

The most important file in Tomcat is the server.xml file. When Tomcat starts, it uses a version of the Apache Commons Digester to read the file; the digester is a utility that reads an XML file and creates Java objects from a set of rules. With what you have learned above, you can see that the structure of the file follows the Tomcat architecture exactly.
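
To see how the nesting in server.xml mirrors the architecture, the sketch below parses a cut-down configuration and prints the element hierarchy. It uses the JDK's own DOM parser rather than the Commons Digester, so it only illustrates the structure and not how Tomcat builds its objects. The class name ServerXmlSketch is hypothetical, and the embedded XML is an illustrative fragment; a real server.xml contains more elements and attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Walks a cut-down server.xml with the JDK DOM parser and lists the nested elements.
// This is not the Commons Digester; it only shows the nesting of the components.
class ServerXmlSketch {
    // Illustrative fragment only, trimmed down for the example.
    static final String SAMPLE =
          "<Server port='8005' shutdown='SHUTDOWN'>"
        + "<Service name='Catalina'>"
        + "<Connector port='8080' protocol='HTTP/1.1'/>"
        + "<Connector port='8009' protocol='AJP/1.3'/>"
        + "<Engine name='Catalina' defaultHost='localhost'>"
        + "<Host name='localhost' appBase='webapps'/>"
        + "</Engine>"
        + "</Service>"
        + "</Server>";

    // Return one line per element, indented by two spaces per nesting level.
    static List<String> outline(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> lines = new ArrayList<>();
            walk(root, 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static void walk(Element element, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(element.getTagName());
        lines.add(line.toString());
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, depth + 1, lines);
            }
        }
    }

    public static void main(String[] args) {
        outline(SAMPLE).forEach(System.out::println);
    }
}
```

Running it lists Server at the top, with Service, the two Connectors, Engine and Host indented beneath it, the same order in which the components were introduced earlier.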