ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.

advertisement

AddThis Social Bookmark Button

The Performance of Java's Lists
Pages: 1, 2, 3

The LinkedList implementation

LinkedList is implemented using a list of doubly-linked nodes. To access elements by index you need to traverse all the nodes until the indexed node is reached:



public Object get(int index) {
  //check the index is valid first .. code not shown here
  Entry e = header;  //starting node
  //go forwards or backwards depending on which is closer
  if (index < size/2) {
    for (int i = 0; i <= index; i++)
      e = e.next;
  } else {
    for (int i = size; i > index; i--)
      e = e.previous;
  }
  return e;
}

Inserting elements into the list is straightforward: traverse to the node at the index and insert a node immediately before that node:

public void add(int index, Object element) {
  //check the index is valid first .. code not shown here
  Entry e = header;  //starting node
  //go forwards or backwards depending on which is closer
  if (index < size/2) {
    for (int i = 0; i <= index; i++)
      e = e.next;
  } else {
    for (int i = size; i > index; i--)
      e = e.previous;
  }


  Entry newEntry = new Entry(element, e, e.previous);
  newEntry.previous.next = newEntry;
  newEntry.next.previous = newEntry;
  size++;
}

Thread-safe LinkedLists and other collections

If you need a thread-safe LinkedList from the Java SDK, you can obtain one using a synchronized wrapper from the Collections.synchronizedList(List) method. However, synchronized wrappers add a level of indirection which can have a high performance cost. A synchronized wrapped method can be up to two or three times slower than the unwrapped method; every method requires an extra method call as the wrapper redirects the call to the wrapped method. This indirection is less significant with complex methods, such as searches, but with simpler methods such as accessors and updaters the extra overhead can really slowdown performance.

This means that a synchronized wrapped LinkedList is at a significant performance disadvantage to Vector, as Vector does not require any extra indirection for thread-safety. If you require a thread-safe LinkedList, you will obtain a faster implementation if you copy the LinkedList class and make the required methods synchronized. This is also true of all the other collection classes: only List and Map have efficient thread-safe implementations (the Vector and Hashtable classes respectively). Amusingly these two efficient thread-safe classes exist purely because of backwards compatibility and not for performance considerations.

The LinkedList implementation has a performance overhead for indexed access and update of elements, since access to any index requires you to traverse multiple nodes. Adding elements to the collection suffers from the index traversal access performance drawback and, in addition, has a further overhead in requiring the creation of a node object. On the plus side, there is no further overhead to insertions and deletions, so insertions-deletion overhead is really mostly dependent on how far away the insertion-deletion index is from the ends of the collection.

Performance tests

There are many different functions of the classes that could be tested. LinkedLists are frequently used because of their supposed better performance for random index insertion and deletion, so I decided to focus on insertion performance, i.e. building collections. I've tested LinkedList against ArrayList since both are unsynchronized (see the "Thread-safe LinkedLists and other collections" sidebar).

The insertion speed is critically dependent on the size of the collection and the position where the element is to be inserted. All the best and worst performance cases arise when inserting either at one of the ends or at the exact middle point of the collection. Consequently, I've chosen three insertion locations (start, middle, end of the collection), and three representative collection sizes of medium (100 elements), large (10,000 elements), and very large (1,000,000 elements).

For my tests I used the Sun JVMs from the Java SDK 1.2.0 and 1.3.0 release. In addition I also tested with HotSpot JVM 2.0, the version available with the 1.3.0 SDK. All measured times in each table have been normalized to one of the SDK 1.2 VM tests (the cell showing 100% in each table). The default JVM configuraton with JIT enabled was used, though heap space needed to be increased for all JVMs to avoid out-of-memory errors. Times recorded were the averages over several test runs. In order to avoid garbage collection interference, I forced a full scavenge of all memory between each test (see the source code for details). The disk was monitored to ensure that disk paging did not occur (any test runs which showed significant paging were discarded). Each test that was too slow to give a multiple second response time was looped repeatedly until a significantly long enough time was recorded.


Table 1: Building a medium sized collection (100 elements). Figures in brackets are for pre-sized collections

 

1.2 JVM

1.3 JVM

HotSpot 2.0 JVM

Always inserting at the start of the ArrayList

100% (48.0%)

184.9% (152.0%)

108.0% (66.7%)

Always inserting at the start of the LinkedList

135.5%

109.1%

85.3%

Always inserting at the mid-point of the ArrayList

130.0% (40.6%)

187.4% (158.0%)

84.7% (46.0%)

Always inserting at the mid-point of the LinkedList

174.0%

135.0%

102.3%

Always inserting at the end of the ArrayList

63.3% (20.7%)

65.9% (25.0%)

60.3% (29.3%)

Always inserting at the end of the LinkedList

106.7%

86.3%

80.3%

For short collections, ArrayList and LinkedList are pretty close in performance. ArrayLists have the edge when inserting to the end of the collection, i.e. appending elements. But then appending elements is the single operation that ArrayList is most optimized for: if you just want a statically sized collection, a Java array (e.g. Object[]) gives better performance than any collection object. Beyond the append operation, measured timings are very mixed and reflect various JVM optimizations more than anything else.

For example, for insertions to the start of the collection (first two rows of table 1), the HotSpot 2.0 JVM with LinkedLists giving the fastest time (85.3%), and the second fastest time is the 1.2 JVM with ArrayList (100%). These two results illustrate that the simple JIT compiler in 1.2 is very efficient at doing simple things like iterating over and copying arrays. The more sophisticated JVM with optimizing compiler in HotSpot is able to improve more complex activities such as object creation (of LinkedList nodes) and take advantage of code-inlining. The 1.3 JVM results seem to indicate a severe underperformance in simple operations which is likely to be improved in later JVM versions.

What I have only partially measured here are other advantages that ArrayList has over LinkedList, namely the ability to pre-size collections, and the reduced garbage collection overhead of ArrayLists. Specifically, ArrayLists can be created with a particular size (e.g. in this test the ArrayList could be created with a capacity of 100 elements), thus avoiding all the growth overheads. The figures in brackets in table 1 show the improvements possible with pre-sizing collections. LinkedLists (up to SDK 1.3) cannot be pre-sized.

Pages: 1, 2, 3

Next Pagearrow