Servlets.com

Free Cache: Come and Get It!
March 9, 2000

by Jason Hunter

In this week's article we take a break from philosophical debate and go heads-down technical with a look at the com.oreilly.servlet.CacheHttpServlet class. I'll show how to use CacheHttpServlet to speed servlet response time by automatically caching response data, and I'll talk about the tricks used in the CacheHttpServlet code that make automatic caching possible.

Client Caching is Good

Servlet programmers should already be familiar with the getLastModified() method. It's a standard method from HttpServlet that a servlet can implement to report when its content last changed. Servers traditionally use this information to support "Conditional GET" requests that maximize the usefulness of browser caches. When a client requests a page that is already in its browser cache, the server can check the servlet's last modified time and, if the page hasn't changed since the cached version, return an SC_NOT_MODIFIED response instead of sending the page again. See Chapter 3 of "Java Servlet Programming" for a detailed description of this process.
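To make the Conditional GET decision concrete, here's a minimal sketch in plain Java, with no servlet API involved. The class name ConditionalGetSketch and the method decideStatus() are made up for illustration. The logic approximates what a server does with a servlet's last modified time and is not copied from any real implementation.

```java
// Plain-Java model of the Conditional GET decision (hypothetical names throughout).
final class ConditionalGetSketch {

  static final int SC_OK = 200;            // same value as HttpServletResponse.SC_OK
  static final int SC_NOT_MODIFIED = 304;  // same value as HttpServletResponse.SC_NOT_MODIFIED

  // lastModified:    what the servlet's getLastModified() returned, or -1 if unknown
  // ifModifiedSince: the If-Modified-Since request header in milliseconds,
  //                  or -1 if the client didn't send one
  static int decideStatus(long lastModified, long ifModifiedSince) {
    if (lastModified == -1) {
      return SC_OK;  // servlet can't say when it changed, so always regenerate
    }
    // HTTP dates carry whole seconds only, so round down before comparing
    long lastModifiedSeconds = lastModified / 1000 * 1000;
    if (ifModifiedSince < lastModifiedSeconds) {
      return SC_OK;  // content is newer than the browser's copy
    }
    return SC_NOT_MODIFIED;  // browser's copy is still current
  }

  public static void main(String[] args) {
    long changedAt = 952_560_000_500L;  // arbitrary timestamp in milliseconds
    System.out.println(decideStatus(changedAt, -1));                      // 200
    System.out.println(decideStatus(changedAt, 952_560_000_000L));        // 304
    System.out.println(decideStatus(changedAt + 5000, 952_560_000_000L)); // 200
  }
}
```

Note the rounding step: without it, a page modified at 500 milliseconds past the second would always look newer than the whole-second date the browser sends back.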

Server Caching is Better

The problem with this use of getLastModified() is that the cache lives on the client side, so the performance gain only occurs in the relatively rare case where a client hits Reload repeatedly. What we really want is a server-side cache so that a servlet's output can be saved and sent from cache to different clients as long as the servlet's getLastModified() method says the output hasn't changed. CacheHttpServlet provides exactly this.

To implement this server-side caching behavior a servlet must:

  • Extend com.oreilly.servlet.CacheHttpServlet instead of HttpServlet
  • Implement a getLastModified(HttpServletRequest) method as usual

Servlets implementing this trick can have their output caught and cached on the server side, then automatically resent to clients as appropriate according to the servlet's getLastModified() method. This can greatly speed servlet page generation, especially for servlets whose output takes a significant time to produce but changes only rarely, such as servlets that display database results.

It Really Works

In fact, I developed CacheHttpServlet to solve exactly this problem. As you may have seen, Servlets.com maintains a listing of all ISPs known to support servlets, along with an ISP review mechanism where customers can share their experiences with each ISP. The ISP listing is generated from a database, and since I'm not a database expert, I had problems generating the front page quickly. The page had to list all the ISPs with an average overall rating next to each, and I couldn't figure out how to generate the average ratings without doing a separate query per ISP. My solution was to write CacheHttpServlet. Now the page serves immediately, and only after the database changes (an event noted using a timestamp variable in the ServletContext) does the page have to regenerate.

Try a Guestbook Example

The following example shows a servlet taking advantage of CacheHttpServlet. It's a Guestbook servlet that displays user-submitted comments. The servlet stores the user comments in memory as a Vector of GuestbookEntry objects. To simulate reading from a slow database, the display loop has a 0.2 second delay per entry (up to a maximum of 5 seconds). As the entry list gets longer, the rendering of the page gets slower. However, because the servlet extends CacheHttpServlet, the rendering only has to occur during the first GET request after a new comment is added. All later GET requests send the cached response.

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

import com.oreilly.servlet.CacheHttpServlet;

public class Guestbook extends CacheHttpServlet {

  private Vector entries = new Vector();  // User entry list
  private long lastModified = 0;          // Time last entry was added

  // Display the current entries, then ask for a new entry
  public void doGet(HttpServletRequest req, HttpServletResponse res)
                               throws ServletException, IOException {
    res.setContentType("text/html");
    PrintWriter out = res.getWriter();

    printHeader(out);
    printForm(out);
    printMessages(out);
    printFooter(out);
  }

  // Add a new entry, then dispatch back to doGet()
  public void doPost(HttpServletRequest req, HttpServletResponse res)
                                throws ServletException, IOException {
    handleForm(req, res);
    doGet(req, res);
  }

  private void printHeader(PrintWriter out) throws ServletException {
    out.println("<HTML><HEAD><TITLE>Guestbook</TITLE></HEAD>");
    out.println("<BODY>");
  }

  private void printForm(PrintWriter out) throws ServletException {
    out.println("<FORM METHOD=POST>");  // posts to itself
    out.println("<B>Please submit your feedback:</B><BR>");
    out.println("Your name: <INPUT TYPE=TEXT NAME=name><BR>");
    out.println("Your email: <INPUT TYPE=TEXT NAME=email><BR>");
    out.println("Comment: <INPUT TYPE=TEXT SIZE=50 NAME=comment><BR>");
    out.println("<INPUT TYPE=SUBMIT VALUE=\"Send Feedback\"><BR>");
    out.println("</FORM>");
    out.println("<A HREF=\"http://www.servlets.com/soapbox/freecache.html\">");
    out.println("Back to the article</A>");

    out.println("<HR>");
  }

  private void printMessages(PrintWriter out) throws ServletException {
    String name, email, comment;
    int numEntries = 0;

    Enumeration e = entries.elements();
    while (e.hasMoreElements()) {
      numEntries++;
      GuestbookEntry entry = (GuestbookEntry) e.nextElement();
      name = entry.name;
      if (name == null) name = "Unknown user";
      email = entry.email;
      if (email == null) email = "Unknown email";
      comment = entry.comment;
      if (comment == null) comment = "No comment";
      out.println("<DL>");
      out.println("<DT><B>" + name + "</B> (" + email + ") says");
      out.println("<DD><PRE>" + comment + "</PRE>");
      out.println("</DL>");

      // Sleep for 0.2 seconds to simulate a slow data source, max at 5.0 sec
      if (numEntries * 0.2 <= 5.0) {
        try { Thread.sleep(200); } catch (InterruptedException ignored) { }
      }
    }
  }

  private void printFooter(PrintWriter out) throws ServletException {
    out.println("</BODY>");
  }

  private void handleForm(HttpServletRequest req,
                          HttpServletResponse res) {
    GuestbookEntry entry = new GuestbookEntry();

    entry.name = req.getParameter("name");
    entry.email = req.getParameter("email");
    entry.comment = req.getParameter("comment");

    entries.addElement(entry);

    // Make note we have a new last modified time
    lastModified = System.currentTimeMillis();
  }

  public long getLastModified(HttpServletRequest req) {
    return lastModified;
  }
}

class GuestbookEntry {
  public String name;
  public String email;
  public String comment;
}

How CacheHttpServlet Works

To view the CacheHttpServlet source code, click on the following link. The code will display in a new window.

Source code for CacheHttpServlet

The basic idea is that, before handling a request, the CacheHttpServlet class checks the value of getLastModified(). If the output cache is at least as current as the servlet's last modified time, the cached output is sent without calling the servlet's doGet() method.
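The check can be modeled in a few lines of plain Java. This is only a sketch under my own assumptions, not the CacheHttpServlet source: the class LastModifiedCacheSketch, its serve() method, and the use of a Supplier in place of doGet() are all invented for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical stand-in for the caching idea: regenerate only when the
// reported last modified time is newer than the one the cache was built for.
final class LastModifiedCacheSketch {

  private final Supplier<String> renderer;   // plays the role of doGet()
  private final LongSupplier lastModified;   // plays the role of getLastModified()
  private String cachedOutput;               // null until the first render
  private long cachedAt = -1;                // lastModified value at caching time

  LastModifiedCacheSketch(Supplier<String> renderer, LongSupplier lastModified) {
    this.renderer = renderer;
    this.lastModified = lastModified;
  }

  synchronized String serve() {
    long modified = lastModified.getAsLong();
    // -1 means "unknown", so the cache can never be trusted in that case
    if (cachedOutput != null && modified != -1 && modified <= cachedAt) {
      return cachedOutput;  // cache is at least as current as the content
    }
    cachedOutput = renderer.get();  // the slow path
    cachedAt = modified;
    return cachedOutput;
  }
}
```

A caller would construct it with something like new LastModifiedCacheSketch(() -> renderPage(), () -> timestamp), where renderPage() and timestamp are whatever the application already has. The slow renderer then runs once per change rather than once per request.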

To be safe, if CacheHttpServlet detects that the request's query string, extra path info, or servlet path has changed, the cache is invalidated and recreated. This may be overly conservative, but generating a page needlessly is far better than returning the wrong page from cache. The class does not invalidate the cache based on differing request headers or cookies. For servlets that vary their output based on these values (e.g. a session tracking servlet), this class should probably not be used, or the getLastModified() method should take the headers and cookies into consideration.
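Here's one way the "same request?" comparison could look, again as a plain-Java sketch. CacheKeySketch and sameRequestAs() are hypothetical names, and the only claim is that the three values named above are compared. Any of them may be null (a request with no query string, for instance), so the comparison has to be null-safe.

```java
import java.util.Objects;

// Hypothetical holder for the three request values that must match
// before cached output may be reused.
final class CacheKeySketch {

  final String queryString;  // may be null
  final String pathInfo;     // may be null
  final String servletPath;  // may be null

  CacheKeySketch(String queryString, String pathInfo, String servletPath) {
    this.queryString = queryString;
    this.pathInfo = pathInfo;
    this.servletPath = servletPath;
  }

  // True only if all three values are equal, treating null as equal to null
  boolean sameRequestAs(CacheKeySketch other) {
    return Objects.equals(queryString, other.queryString)
        && Objects.equals(pathInfo, other.pathInfo)
        && Objects.equals(servletPath, other.servletPath);
  }
}
```

Headers and cookies are deliberately absent from the key, which is exactly why a servlet whose output depends on them needs to be careful.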

There's no caching performed for POST requests, since they are not idempotent. (Remember that word from the book? It means, in essence, safely repeatable.) This has an interesting side-effect for the Guestbook example -- if you do a page reload after submitting a new comment, you'll see the page generation will take some time even though it's the same page output you already viewed. This is because the submit was a POST request whose output was not cached.

CacheHttpServletResponse and CacheServletOutputStream are helper classes. CacheHttpServletResponse captures all the response information including the response body, status code, and header values. It has to be fairly complicated to support the response.reset() capability added in Servlet API 2.2. CacheServletOutputStream simply wraps around the true ServletOutputStream, capturing everything while it passes content on to the underlying stream.
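The capture-while-passing-through idea is the classic "tee" pattern. The following sketch shows it on an ordinary java.io.OutputStream rather than a ServletOutputStream, so it runs without the servlet API. TeeOutputStreamSketch and capturedBytes() are names I made up, and this is not the CacheServletOutputStream code itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical tee stream: every byte goes to the real destination
// and to an in-memory copy that can be replayed later.
final class TeeOutputStreamSketch extends OutputStream {

  private final OutputStream delegate;  // the "true" stream
  private final ByteArrayOutputStream captured = new ByteArrayOutputStream();

  TeeOutputStreamSketch(OutputStream delegate) {
    this.delegate = delegate;
  }

  @Override
  public void write(int b) throws IOException {
    captured.write(b);
    delegate.write(b);
  }

  @Override
  public void write(byte[] buf, int off, int len) throws IOException {
    captured.write(buf, off, len);
    delegate.write(buf, off, len);
  }

  @Override
  public void flush() throws IOException {
    delegate.flush();
  }

  @Override
  public void close() throws IOException {
    delegate.close();
  }

  // A copy of everything written so far
  byte[] capturedBytes() {
    return captured.toByteArray();
  }
}
```

Because the bytes pass straight through, the first client sees its response with no extra delay. Only later clients are served from the captured copy.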

The code has been built against Servlet API 2.2. Using it with previous Servlet API versions works fine; using it with future API versions likely won't work, because the HttpServletResponse interface that CacheHttpServletResponse must implement will probably change and leave some interface methods unimplemented. Updated versions will always be available at Servlets.com.

It's particularly interesting to look at how CacheHttpServlet catches the request for early processing. It implements the service(HttpServletRequest, HttpServletResponse) method that the server calls to pass request handling control to the servlet. The standard HttpServlet implementation of this method dispatches the request to the doGet(), doPost(), and other methods depending on the HTTP method of the request. CacheHttpServlet overrides that implementation and thus gains first control over the request handling. When the class has finished its processing, it can pass control back to the HttpServlet dispatch implementation with a super.service() call.
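The override trick is easier to see with the servlet machinery stripped away. In this sketch, DispatchingBase stands in for HttpServlet and InterceptingSketch stands in for CacheHttpServlet. Both classes, the String method argument, and the log list are simplifications of my own, not the real signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for HttpServlet: service() dispatches on the HTTP method name.
class DispatchingBase {
  final List<String> log = new ArrayList<>();

  void service(String method) {
    if ("GET".equals(method)) {
      doGet();
    } else if ("POST".equals(method)) {
      doPost();
    }
  }

  void doGet()  { log.add("doGet"); }
  void doPost() { log.add("doPost"); }
}

// Stand-in for CacheHttpServlet: overriding service() gives it first look
// at every request before any doXXX() method runs.
class InterceptingSketch extends DispatchingBase {
  long lastModified = 0;       // what getLastModified() would report
  private long cachedAt = -1;  // lastModified value when the cache was filled

  @Override
  void service(String method) {
    if (!"GET".equals(method)) {
      super.service(method);   // POST and friends are never cached
      return;
    }
    if (lastModified <= cachedAt) {
      log.add("cache");        // answer without calling doGet()
      return;
    }
    super.service(method);     // hand control back to the normal dispatch
    cachedAt = lastModified;
  }
}
```

Calling service("GET") twice and then service("POST") on a fresh InterceptingSketch leaves the log as doGet, cache, doPost: the second GET never reaches doGet(), while the POST goes straight through.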

By extending this class and implementing the standard HttpServlet.getLastModified() method, a servlet can take advantage of a simple yet powerful server-side caching mechanism.

If you're interested in using this class, you don't need to cut and paste the code. Go download the com.oreilly.servlet package available at http://www.servlets.com/cos.

This class is one of the many new topics covered in Java Servlet Programming, 2nd Edition.


This is the third in a series of articles. Previous articles were "The Problems with JSP" and "Reactions to The Problems with JSP". To be notified when new articles are added to the site, subscribe here.

Care to comment on the article? Fire an email to caching-talk@servlets.com. Interesting comments may be posted unless you specify otherwise.

 



Copyright © 1999-2005 Jason Hunter

webmaster@servlets.com
Last updated: March 1, 2009