LinuxDevCenter.com

Ads in Cache-Friendly Pages

by Jennifer Vesperman
03/21/2002

You want to make your Web pages cache-friendly, but you're worried you'll lose advertising revenue. Ads are a reality in our current model of the Web, and it's important to ensure that they work and to keep track of how often they are served. Fortunately, it's easy to make ads work without losing cache friendliness in our pages.

To explain how to make ads uncacheable while keeping the page itself cacheable, I first need to explain entities and how to apply Cache-control headers selectively. For a detailed description of Cache-control headers, see Cache-Friendly Web Pages.

The principles described here work for all the major elements in a Web page. Once you understand them, you can selectively apply cache headers to every entity on your pages.

Entities and cache control

HTTP is a client-server protocol. The client initiates the contact, asking the server to provide the entity described in the URI (Uniform Resource Identifier). A URI can be a location (a URL) or a name (a URN). URLs are the most common, but HTTP can handle either.

The server returns a response. The response consists of some HTTP headers and an "entity." The entity includes entity headers and a body that is (we hope) the requested data. Expires is an entity header. Cache-control is formally a general header in HTTP/1.1, but in a response it works the same way: both apply only to the response they are sent with, and so only to the entity that response carries.
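As a rough illustration of that structure, the sketch below splits a raw response into its status line, headers, and body. The response text, its header values, and the `split_response` helper are made up for this example; a real client would use an HTTP library rather than string splitting.

```python
# A hypothetical HTTP response, written out by hand for illustration.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Thu, 21 Mar 2002 10:00:00 GMT\r\n"
    "Cache-Control: max-age=3600\r\n"
    "Expires: Thu, 21 Mar 2002 11:00:00 GMT\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body><img src=\"/ads/banner.gif\"></body></html>"
)


def split_response(raw):
    """Split a raw response into (status line, header dict, body)."""
    # A blank line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


status_line, headers, body = split_response(RAW_RESPONSE)
print(status_line)
print(headers["cache-control"], "|", headers["expires"])
```

The cache headers found here describe only this HTML response. The image named inside the body arrives in its own response, with its own headers.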

Inside the entity, especially if the entity is HTML, there may be URIs referring to additional entities. HTML image tags are the most common of these, and Web browsers read image tags and then send additional GET requests for the new entities to Web servers.

Any time you include a URI (relative or full) in your HTML, you may be referring to another entity. The exception is a fragment identifier, such as #foo, that points into the entity it appears in. If stripping the fragment off would leave you pointing to the same page, it's the same entity. Other than that, every distinct URI refers to a distinct entity.
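The fragment rule can be checked mechanically. The following is a minimal sketch using Python's standard `urllib.parse` module; the `same_entity` helper and the example.com URLs are hypothetical names chosen for illustration.

```python
from urllib.parse import urldefrag, urljoin


def same_entity(base_url, ref_a, ref_b):
    """Return True if two references on a page name the same entity.

    Each reference is resolved against the page's own URL, then the
    fragment (the part after '#') is stripped before comparing.
    """
    url_a, _fragment_a = urldefrag(urljoin(base_url, ref_a))
    url_b, _fragment_b = urldefrag(urljoin(base_url, ref_b))
    return url_a == url_b


page = "http://www.example.com/news/index.html"

# A bare fragment points back into the page itself.
print(same_entity(page, "#foo", "index.html"))
# A different path is a different entity.
print(same_entity(page, "index.html", "logo.png"))
```

This comparison is purely textual. It treats URIs that differ only in their query string as distinct entities, which matches the rule that every distinct URI refers to a distinct entity.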

The browser (or other client) will make a separate request to pull down each entity. In the case of a link, it waits for the user to initiate the request. In the case of images, it usually initiates the request itself. (Some browsers do not request images, or do so only on user-initiated request.) Other included objects may or may not be automatically downloaded; please see the HTML specification to determine what the browser is expected to do with them.
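To make the distinction concrete, here is a small sketch that scans a page and sorts its references into those a typical browser fetches on its own (images) and those it fetches only when the user asks (links). The `EntityCollector` class, the sample markup, and the URLs are all hypothetical, and only `img` and `a` tags are handled; other included objects are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EntityCollector(HTMLParser):
    """Collect the URIs of images and links referenced by an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.auto_fetched = []    # images: usually requested by the browser itself
        self.user_initiated = []  # links: requested only when the user follows them

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.auto_fetched.append(urljoin(self.base_url, attributes["src"]))
        elif tag == "a" and attributes.get("href"):
            self.user_initiated.append(urljoin(self.base_url, attributes["href"]))


PAGE = (
    '<p><a href="/about.html">About</a>'
    '<img src="logo.png"><img src="/ads/banner.gif"></p>'
)

collector = EntityCollector("http://www.example.com/news/index.html")
collector.feed(PAGE)
print(collector.auto_fetched)
print(collector.user_initiated)
```

Each URI in either list leads to a separate request and a separate response.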

All images, including images that are ads, are individual entities. And individual entities have their own, individual Cache-control or Expires headers. So we can have ads in cache-friendly Web pages without actually caching the ads.

This fact also allows us to have images with very long expiry times in frequently changing (and rapidly expiring) Web pages, and to have other elements uncacheable.

  • Set the expiry of the main Web page to whatever expiry is appropriate for that page.
  • Set the expiry of unchanging images or other included entities to the maximum (a year).
  • Set the expiry of anything you do not want cached to "do not cache," using either Cache-control or a zero expiry time. You may want to do this for ads.
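The three rules above might be sketched as a single function that picks headers for each entity. Everything specific here is an assumption made for illustration: the `/ads/` path prefix, the image extensions, the one-hour default for pages, and the choice of `no-cache` (which makes caches revalidate with the origin server before reusing a copy) as the "do not cache" directive. In practice these settings usually live in the Web server's configuration rather than in application code.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def cache_headers(path, now, page_max_age=3600):
    """Choose cache headers for one entity, based on its (hypothetical) path."""
    # Check ads first: an ad is usually an image too.
    if path.startswith("/ads/"):
        # "Do not cache": no-cache plus a zero expiry time (Expires equals now).
        return {
            "Cache-Control": "no-cache",
            "Expires": format_datetime(now, usegmt=True),
        }
    if path.endswith((".gif", ".png", ".jpg")):
        max_age = ONE_YEAR       # unchanging images: the maximum, one year
    else:
        max_age = page_max_age   # the page itself: whatever suits that page
    expires = now + timedelta(seconds=max_age)
    return {
        "Cache-Control": f"max-age={max_age}",
        "Expires": format_datetime(expires, usegmt=True),
    }


# format_datetime(..., usegmt=True) requires a UTC-aware datetime.
now = datetime(2002, 3, 21, 10, 0, 0, tzinfo=timezone.utc)
for path in ("/news/index.html", "/images/logo.png", "/ads/banner.gif"):
    print(path, cache_headers(path, now))
```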

Setting ads to be uncacheable is one way of attempting to count the hits accurately. It's a rather unfriendly way of doing it, though: it forces a (usually) static object to be reloaded every time.

O'Reilly Media
© 2008, O'Reilly Media, Inc.