CSC 241- Set5

CSC 241- Set #5

Linked List

You are all familiar with the use of arrays to keep a set of elements in memory. There are many good things about arrays--they allow us to randomly access cells, thats vital in some applications, addressing cells is also easy since each cell has a nice numeric address and each comes after the other. There are applications where we need dynamic space allocation and that is why we introduce you to linked lists.

Here are few reasons for why arrays might not meet our storage needs perfectly:

Arrays are static, they don't grow when you want them to magically.
If you need to insert in the middle of an array, you must first make room for the new element. For instance, if the new value needs to go into position 10, elements in position 10 and on have to move to make room for it. Removing values cause the same sort of problem.
There is also no concept of physically removing a cell, if you allocate 100 cells, you have 100 cells forever.
We may be working on an application where we can't predict the number of cells that we need.
There is a concern about the arrays space utilization. To be sure that we have enough room, we sometimes need to allocate a large number of cells. This could lead to programs that cause excessive disk cashing. In such cases, the operating system can't keep the whole program in memory at one time. Every time a portion of the program is addressed that is not in memory, the system needs to read it from the disk and unload a different portion of the program that is in memory. Of course, none of this activity has anything to do with what the program is trying to do. I guess, it is like the program spinning its wheels once in a while.
To make sure that operations like insertion or deletion are performed efficiently, we end up developing algorithms that are complex and harder to follow. The circular queue is a good example, it uses the array space efficiently, but it is harder to follow than the simple version.

Linked lists can be implemented with arrays, but typically in languages like Java that support dynamic allocation, they are not. You can create cells as you need them, and can remove them when you wish. But, you don't have a nice numeric address like you do with arrays. The address of cells are NOT in sequence nicely like they are in arrays, in fact, we don't refer to them as cells, we call them nodes. You have to get each element of the sequence to remember the address of the next. In the object oriented paradigm, we say that each node keeps a reference to the next node. sometimes, we design our nodes to even keep a reference to a previous node, we refer to linked lists with such nodes as doublely-linked lists. Other times we design the linked list, so that the last node points to the first one. Such linked lists are referred to as circular linked lists. Linked lists are a tremendous asset. We add nodes as we need to, we can put a node between two other nodes by changing their references. In java, we can delete a node by simply making their previous node reference their next node, if we have a doublely linked list, their next node will now reference their previous node also. Java's Garbage Collector will collect any space that is left unreferenced. So, by simply not referencing a node we have effectively deleted it. Lets look dynamicQueue implementation of Queue as an example.

Dynamic Queue

This example will help us see how such implementation can allow our data structure to grow and shrink as it needs to; you gain flexibility in space allocation when you implement a queue this way. Let us consider the dynamic Queue class that implements the Queue Interface. First of all, it uses a simple linked list concept to implement a queue. Two fields are defined for each dynamicQueue object, one to reference to the Qnode that is in front of the linked list (front_), and the other references the node in the back of the queue (back_). Both fields are set to null at construction.

The code for Qnode.java is straightforward,. There are two fields, element_ (the object being held) and next_ (the next node in the chain). Since nothing is public here, Qnode is only accessible in the Queue package, and of course, only used by the dynamicQueue class.

In the enqueue method, a new element is put into a new node object and added to the back of the queue. if the queue is empty, front_ and back_ both reference the new node with the new object; otherwise, back_.next_ references the new node and then back_ is changed to reference the new node as well. It is important to undertsand the back_=back_.next_=new Qnode(x,null);; it does three things, 1st it creates a new node with the new object, it binds the new node to back_.next_, and lastly, it sets back_ to reference the new node.

  public void enqueue(Object x)    throws Exception {
    if (full()) 
      throw new Exception();
    if (front_ == null) //empty queue
      front_=back_=new Qnode(x,null);
    else   //last element in queue
      back_=back_.next_=new Qnode(x,null);
  }

We first check for full which can't happen. Then, we check to see if we are empty, this only happens in this implementation when we 1st create our queue. If front_ is null, both fron_ and back_ need to be set to a new Qnode object with the object being enqueued in it; the null value for the second argument will set next_ of this node to null. Consider the following list:


     jones ---> fair ---> mary ---> leo --->||
front_ ^                        back_^

Here is what happens when we enqueue mark:


     jones ---> fair ---> mary ---> leo --->mark --->||
front_ ^                                back_^

Let use consider dequeue:

  public void dequeue() throws Exception {
    if (empty()) 
      throw new Exception();
    if (front_.next_ == null)
      front_=back_=null;//remove last element.
    else
      front_=front_.next_;
  }

We throw an exception if queue is empty. if the queue is not empty, we need to now figure out if there is only one element on the queue. if (front_.next_ == null) is true, both fron_ and back_ are set to null. Otherwise, front_ is set to its own next_ field, in effect, removing the front element.

Testing queue

test.java is designed to test either version of queue. As you recall, queue itself is an interface with no implementation of its own. We extended it to have a fixed (array) and a dynamic (linked list) implementation. It is designed so that user can specify a datafile, the version to test, and if necessary, the number of cells for the array in the "fixed" version. This information is taken from the command line. Here are two examples of running this program:

java test test.dat fixed 10 will test the fixed version using test1.dat as the data file. Also, the size 10 is chosen here for the array that keeps the queue elements.

java test test.dat dynamic will test the dynamic version using test.dat as the data file. Notice that no size is specified, as it is not relevant in the dynamic case.

Command Line Parameters

Processing of command line parameters is mostly useful in utility programs, such as, mv and ls. In test.java, it helps to test the different versions queue quickly, without engaging in long dialog with the program to set things up.

In test.java, the main method has an array of strings as a parameter (arg). This array is filled with the space separated strings on the command line that appear after the program name. arg[0] in the examples shown above is test.dat. When looking at the code for this program, you will notice that the file opened for input gets its name from arg[0].

Comparing Objects

One of the things that I like you to notice is the use of the equals method when trying to figure out which version is being tested. You ask yourself, why couldn't we simply say if (arg[1] == dynamic). The reason is simple, "==" testes equality of the references and not what the objects contain. In other words, we can only find out if arg[1] and dynamic are both referencing the same memory cell.

Delayed Construction

Note that the declaration of q occurs before we know which kind the user is testing. Recall that you can't make objects of queue, it is an abstract class. Once the program figures out which version is to be tested, either q = new dynamicQueue();, or q = new fixedQueue(size); are used to create the right version of the queue.

Polymorphism

Consider the code segment:


    System.out.println("\nHere is what is left in the queue:\n");
      while (!q.empty()) {
        try {
          text = (String)q.peek();
          q.dequeue();
          System.out.println(text+" was removed from the queue");
        }
        catch(Exception e){
          System.out.println("failed to dequeue an element");
        }
      }

What if we want to take this out of the main method and put in its own method, say, named out. It would be invoked from main with a call like out(q);. Here is the catch, what kind parameter, do you think, out needs to accept? It needs to be queue, as you don't know which implementation is used, so how can you use either of its subclasses. But, this not a problem, as this is one of the fundamental advantages for Object Orientation which we call Polymorphism. Here is a look at the out method:


  static void out(queue q) {
    System.out.println("\nHere is what is left in the queue:\n");
      while (!q.empty()) {
        try {
          text = (String)q.peek();
          q.dequeue();
          System.out.println(text+" was removed from the queue");
        }
        catch(Exception e){
          System.out.println("failed to dequeue an element");
        }
      }
  }