ONJava.com -- The Independent Source for Enterprise Java

How Java Web Servers Work

by Budi Kurniawan
04/23/2003

Editor's Note: this article is adapted from Budi's self-published book on Tomcat internals. You can find more information on his web site.

A web server is also called a Hypertext Transfer Protocol (HTTP) server because it uses HTTP to communicate with its clients, which are usually web browsers. A Java-based web server uses two important classes, java.net.Socket and java.net.ServerSocket, and communicates through HTTP messages. Therefore, this article starts by discussing HTTP and the two classes. Afterwards, I'll explain the simple web server application that accompanies this article.

The Hypertext Transfer Protocol (HTTP)

HTTP is the protocol that allows web servers and browsers to send and receive data over the Internet. It is a request-response protocol: the client makes a request and the server responds to it. HTTP uses reliable TCP connections, by default on TCP port 80. The first version of HTTP was HTTP/0.9, which was superseded by HTTP/1.0. The current version is HTTP/1.1, which is defined by RFC 2616.

This section covers HTTP 1.1 briefly, in just enough detail for you to understand the messages sent by the web server application. If you are interested in more details, read RFC 2616.

In HTTP, the client always initiates a transaction by establishing a connection and sending an HTTP request. The server cannot contact a client on its own or make a callback connection to it. Either the client or the server can terminate a connection prematurely. For example, when using a web browser, you can click the Stop button to stop the download of a file, effectively closing the HTTP connection with the web server.

HTTP Requests

An HTTP request consists of three components:

  • Method-URI-Protocol/Version
  • Request headers
  • Entity body

An example HTTP request is:

POST /servlet/default.jsp HTTP/1.1
Accept: text/plain, text/html
Accept-Language: en-gb 
Connection: Keep-Alive 
Host: localhost 
Referer: http://localhost/ch8/SendDetails.htm 
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows 98) 
Content-Length: 33 
Content-Type: application/x-www-form-urlencoded 
Accept-Encoding: gzip, deflate 

LastName=Franks&FirstName=Michael

The method-URI-Protocol/Version appears as the first line of the request.

POST /servlet/default.jsp HTTP/1.1

Here, POST is the request method, /servlet/default.jsp is the URI, and HTTP/1.1 is the protocol/version.
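As a minimal sketch of how a server might take the first line apart, the following splits it on single spaces into its three fields. The class name RequestLine and its parse method are hypothetical names chosen for illustration; they are not part of the application that accompanies this article.

```java
// Hypothetical helper for illustration only.
final class RequestLine {
    final String method;    // e.g. POST
    final String uri;       // e.g. /servlet/default.jsp
    final String protocol;  // e.g. HTTP/1.1

    private RequestLine(String method, String uri, String protocol) {
        this.method = method;
        this.uri = uri;
        this.protocol = protocol;
    }

    // Splits "METHOD URI PROTOCOL/VERSION" into its three fields.
    static RequestLine parse(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine r = parse("POST /servlet/default.jsp HTTP/1.1");
        // prints: POST | /servlet/default.jsp | HTTP/1.1
        System.out.println(r.method + " | " + r.uri + " | " + r.protocol);
    }
}
```

A line with more or fewer than three fields is rejected rather than guessed at, which is usually the safer choice for a server.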

Each HTTP request uses one of the request methods specified in the HTTP standard. HTTP 1.1 defines eight methods: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, and CONNECT. GET and POST are the most commonly used in Internet applications.

The URI identifies the requested resource. In a request line it is usually interpreted as being relative to the server's root directory, so it normally begins with a forward slash (/). A URL is a type of URI. The protocol version represents the version of HTTP being used.

The request headers contain useful information about the client environment and the entity body of the request. For example, they could contain the language for which the browser is set, the length of the entity body, and so on. Each header ends with a carriage return/linefeed (CRLF) sequence.

A very important blank line (CRLF sequence) comes between the headers and the entity body. This line marks the beginning of the entity body. Some Internet programming books consider this CRLF the fourth component of an HTTP request.

In the previous HTTP request, the entity body is simply the following line:

LastName=Franks&FirstName=Michael

The entity body could easily become much longer in a typical HTTP request.
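To make the role of the blank line concrete, here is a rough sketch that splits a raw request at the first empty line, collects the headers into a map, and checks the entity body against the Content-Length header. The class RawRequest and its members are hypothetical names, and the sketch assumes the whole request is already in memory as a string, which a real server would not normally do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only.
final class RawRequest {
    final String requestLine;
    // Header names are case-insensitive, so the map ignores case on lookup.
    final Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    final String body;

    RawRequest(String raw) {
        // The blank line is the CRLF that ends the last header followed by one more CRLF.
        int blank = raw.indexOf("\r\n\r\n");
        if (blank < 0) {
            throw new IllegalArgumentException("No blank line after the headers");
        }
        String[] headLines = raw.substring(0, blank).split("\r\n");
        requestLine = headLines[0];
        for (int i = 1; i < headLines.length; i++) {
            int colon = headLines[i].indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Malformed header: " + headLines[i]);
            }
            headers.put(headLines[i].substring(0, colon).trim(),
                        headLines[i].substring(colon + 1).trim());
        }
        body = raw.substring(blank + 4);  // everything after the blank line
    }

    // True if the declared Content-Length equals the number of bytes in the body.
    boolean contentLengthMatches() {
        String declared = headers.get("Content-Length");
        return declared != null
            && Integer.parseInt(declared) == body.getBytes(StandardCharsets.ISO_8859_1).length;
    }
}
```

For the example request above, the body LastName=Franks&FirstName=Michael is 33 bytes long, which agrees with its Content-Length header.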

HTTP Responses

Like a request, an HTTP response consists of three parts:

  • Protocol-Status code-Description
  • Response headers
  • Entity body

The following is an example of an HTTP response:

HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 3 Jan 1998 13:13:33 GMT
Content-Type: text/html
Last-Modified: Mon, 11 Jan 1998 13:23:42 GMT
Content-Length: 112

<html>
<head>
<title>HTTP Response Example</title></head><body>
Welcome to Brainy Software
</body>
</html>

The first line of the response is similar to the first line of the request. It tells you that the protocol used is HTTP version 1.1 and that the request succeeded (status code 200, described as OK).

The response headers contain useful information similar to the headers in the request. The entity body of the response is the HTML content of the response itself. As in a request, the headers and the entity body are separated by a blank line (a CRLF sequence).
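The three parts of a response can be assembled in a few lines. The following is a minimal sketch, assuming a UTF-8 encoded HTML body. ResponseBuilder and its ok method are hypothetical names, not classes from the accompanying application. Note that Content-Length is counted in bytes of the encoded body, not in characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class ResponseBuilder {

    // Builds a complete "200 OK" response: status line, headers, blank line, entity body.
    static byte[] ok(String html) {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "\r\n";  // the blank line that separates the headers from the body
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);

        byte[] response = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, response, 0, headBytes.length);
        System.arraycopy(body, 0, response, headBytes.length, body.length);
        return response;
    }
}
```

The resulting byte array can be written as-is to the output stream of a connection.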

The Socket Class

A socket is an endpoint of a network connection. A socket enables an application to read from and write to the network. Two software applications residing on two different computers can communicate with each other by sending and receiving byte streams over a connection. To send a message to another application, you need to know its IP address, as well as the port number of its socket. In Java, a socket is represented by the java.net.Socket class.

To create a socket, you can use one of the many constructors of the Socket class. One of these constructors accepts the host name and the port number:

public Socket(String host, int port)

where host is the remote machine name or IP address, and port is the port number of the remote application. For example, to connect to yahoo.com at port 80, you would construct the following socket:

new Socket("yahoo.com", 80);

Once you create an instance of the Socket class successfully, you can use it to send and receive streams of bytes. To send byte streams, you must first call the Socket class' getOutputStream method to obtain a java.io.OutputStream object. To send text to a remote application, you often want to construct a java.io.PrintWriter object from the OutputStream object returned. To receive byte streams from the other end of the connection, you call the Socket class' getInputStream method, which returns a java.io.InputStream.

The following snippet creates a socket that can communicate with a local HTTP server (127.0.0.1 denotes the local host), sends an HTTP request, and receives the response from the server. It creates a StringBuffer object to hold the response and prints it to the console.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.Socket;

Socket socket     = new Socket("127.0.0.1", 8080);  // the port is an int, not a String
OutputStream os   = socket.getOutputStream();
boolean autoflush = true;
PrintWriter out   = new PrintWriter(os, autoflush);
BufferedReader in = new BufferedReader(
    new InputStreamReader(socket.getInputStream()));

// send an HTTP request to the web server; HTTP lines end with CRLF,
// so write "\r\n" explicitly instead of relying on println
out.print("GET /index.jsp HTTP/1.1\r\n");
out.print("Host: localhost:8080\r\n");
out.print("Connection: Close\r\n");
out.print("\r\n");
out.flush();  // autoflush applies to println, not print

// read the response until the server closes the connection
StringBuffer sb = new StringBuffer(8096);
int i;
while ((i = in.read()) != -1) {  // -1 marks the end of the stream and is not appended
    sb.append((char) i);
}

// display the response on the console
System.out.println(sb.toString());
socket.close();

Note that to get a proper response from the web server, you need to send an HTTP request that complies with the HTTP protocol. If you have read the previous section, "The Hypertext Transfer Protocol (HTTP)," you should be able to understand the HTTP request in the code above.
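The snippet above needs a web server listening on port 8080. If you want to try the client logic without one, the following self-contained sketch pairs it with a one-shot stand-in server built on java.net.ServerSocket. The class LoopbackDemo and its methods fetch, serveOnce, and run are hypothetical names invented for this example, and the stand-in server is only a test fixture, not the web server application described in this article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
final class LoopbackDemo {

    // Client side: sends a GET request and returns the whole response as text.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            String request = "GET " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + ":" + port + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            OutputStream os = socket.getOutputStream();
            os.write(request.getBytes(StandardCharsets.US_ASCII));
            os.flush();

            BufferedReader in = new BufferedReader(new InputStreamReader(
                    socket.getInputStream(), StandardCharsets.ISO_8859_1));
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {  // -1 means the server closed the connection
                sb.append((char) c);
            }
            return sb.toString();
        }
    }

    // Stand-in server: accepts one connection and answers with a fixed page.
    static void serveOnce(ServerSocket server) {
        try (Socket client = server.accept()) {
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    client.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // skip the request line and headers; the blank line ends them
            }
            String body = "<html><body>Hello</body></html>";
            String response = "HTTP/1.1 200 OK\r\n"
                    + "Content-Type: text/html\r\n"
                    + "Content-Length: " + body.length() + "\r\n"
                    + "\r\n"
                    + body;
            OutputStream os = client.getOutputStream();
            os.write(response.getBytes(StandardCharsets.US_ASCII));
            os.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts the stand-in server on a free loopback port and fetches one page from it.
    static String run() {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            Thread t = new Thread(() -> serveOnce(server));
            t.start();
            String response = fetch("127.0.0.1", server.getLocalPort(), "/index.html");
            t.join();
            return response;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Passing port 0 to ServerSocket asks the operating system for any free port, so the demo does not depend on port 8080 being available. Because the request carries Connection: close and the stand-in server closes the socket after writing, the client's read loop ends when read returns -1.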
