CSC 241- Set #4


Linked List

You are all familiar with the use of arrays to keep a set of elements in memory. There are many good things about arrays--they allow us to randomly access cells, thats vital in some applications, addressing cells is also easy since each cell has a nice numeric address and each comes after the other. There are applications where we need dynamic space allocation and that is why we introduce you to linked lists.

Here are few reasons for why arrays might not meet our storage needs perfectly:

  1. Arrays are static, they don't grow when you want them to magically.
  2. If you need to insert in the middle of an array, you must first make room for the new element. For instance, if the new value needs to go into position 10, elements in position 10 and on have to move to make room for it. Removing values cause the same sort of problem.
  3. There is also no concept of physically removing a cell, if you allocate 100 cells, you have 100 cells forever.
  4. We may be working on an application where we can't predict the number of cells that we need.
  5. There is a concern about the arrays space utilization. To be sure that we have enough room, we sometimes need to allocate a large number of cells. This could lead to programs that cause excessive disk cashing. In such cases, the operating system can't keep the whole program in memory at one time. Every time a portion of the program is addressed that is not in memory, the system needs to read it from the disk and unload a different portion of the program that is in memory. Of course, none of this activity has anything to do with what the program is trying to do. I guess, it is like the program spinning its wheels once in a while.
  6. To make sure that operations like insertion or deletion are performed efficiently, we end up developing algorithms that are complex and harder to follow. The circular queue is a good example, it uses the array space efficiently, but it is harder to follow than the simple version.

Linked lists can be implemented with arrays, but typically in languages like Java that support dynamic allocation, they are not. You can create cells as you need them, and can remove them when you wish. But, you don't have a nice numeric address like you do with arrays. The address of cells are NOT in sequence nicely like they are in arrays, in fact, we don't refer to them as cells, we call them nodes. You have to get each element of the sequence to remember the address of the next. In the object oriented paradigm, we say that each node keeps a reference to the next node. sometimes, we design our nodes to even keep a reference to a previous node, we refer to linked lists with such nodes as doublely-linked lists. Other times we design the linked list, so that the last node points to the first one. Such linked lists are referred to as circular linked lists. Linked lists are a tremendous asset. We add nodes as we need to, we can put a node between two other nodes by changing their references. In java, we can delete a node by simply making their previous node reference their next node, if we have a doublely linked list, their next node will now reference their previous node also. Java's Garbage Collector will collect any space that is left unreferenced. So, by simply not referencing a node we have effectively deleted it.

Picture a node as an object that has two parts. One part containing data, as many pieces as needed, and one or more references to other nodes. We will first look at the definition of a singlely linked list. Notice that it is as if a node can be inside another node. Such recursive definitions are common for linked lists. Also notice the constructor, it is designed to allow for setting both the data part of the node and the reference to the next node. The last thing you should notice lack of a qualifier like private or public. This simply means that Node is accessible in the package that it sits in. So, for example if we design a Node class as part of the stringQueue package we get to use it to create nodes in our dynamicStringQueue class which is in the same package. More about this later.


class Node {
   dataType element;
   Node next;
   Node (Node n, dataType e) {
     next=n;
     element=e;
   }
}

Here is a graphical image of what the linked lists would look like, if we were simply holding some numeric values:


10 ---> 20 ---> 30 ---> 40 --->||

Here is the definition of a doublely linked list, note that we have an additional reference to a node (prev).

class Node {
   dataType element;
   Node next;
   Node prev;
   Node (Node n,Node p, dataType e) {
     next=n;
     prev=p;
     element=e;
   }
}

Here is a graphical image of what a doublely linked list would look like, if we were simply holding some numeric values:


||<--- 10 <---> 20 <---> 30 <---> 40 --->||

When implementing any variation of linked lists, you must have a reference to at least one node in the list.


Node head=null;

In this statement we are declaring a variable head which is set to null. null is a value in java that means empty and we use it when setting or checking reference variables. Consider head as a variable that holds a reference to an object of class Node, of course, it is null at this point.


head=new Node(null,10);

In this statement we are making head reference a newly created node that will contain 10 with the nextreference as null. of course, we are assuming that our dataType is int and that we are using the first Node class shown above. It should be obvious that we can't have two classes with the same name, so, if we need two node classes in a package, we need to name them differently.

Consider the following list of elements with head referencing the node that contains 10:


  10 ---> 20 ---> 30 ---> 40 --->||
 head

Lets examine a few statements and see what happens to the list:

head.element=12;

  12 ---> 20 ---> 30 ---> 40 --->||
 head


head.next.element=25;

  12 ---> 25 ---> 30 ---> 40 --->||
 head

Create a new node referenced by temp. This node will contain 17 with its next field set to what head's next is. Sorry about the crude graphics.


Node temp=new(head.next,17);  

  12 ---> 25 ---> 30 ---> 40 --->||
 head     ^
          |
         17
        temp

The following statement, effectively, adds temp's node to the list. head.next=temp;


  12 ---> 17 ---> 25 ---> 30 ---> 40 --->||
  head   temp

temp.next=temp.next.next; //remove the node that contains 25!
         
  12 ---> 17 ---> 30 ---> 40 --->||
   head   temp

Dynamic String Queue

As you may be able to tell, you gain incredible flexibility when you implement a queue with a linked list. Let us consider dynamicStringQueue class. First of all, it uses a doublely linked list which means the second Node class shown above is utilized. Note that two reference variables are used to keep track of both front of the list and the back of the list, first and last. This will facilitate easy insertion and deletions for our queue. Both variables start as null so the constructor doesn't have any code. Just checking last or first against null should determine if if queue is empty. You will find that full always returns false as you can't get full in this implementation. Lets take a careful look at enqueue and dequeue.

  public void enqueue(String x) {
    if (first==null) {
      first=new Node(null,null,x);
      last=first;
    }
    else {
      last.next=new Node(null,last,x);
      last=last.next;
    }
  }

We first check for first being null, that helps us figure out if we need both first and last set, or we only need to add a new node to the end of the list. The statement first=new Node(null,null,x); sets first to reference a newly created node. This new node will contain the new value in it with both its prev and next node references set to null. Note that last gets to reference the same node as first does.

When there are elements already in the list, we create a new node that we place after last. This new node will have a pre node reference that will be set to last. last=last.next makes last reference the new node. Consider the following list:


||<--- jones <---> fair <---> mary <---> leo --->||
       first                             last

Here is what happens when we enqueue mark:


||<--- jones <---> fair <---> mary <---> leo <---> mark --->||
       first                                      last

Let use consider dequeue:
  public String dequeue() {
    Node old_first=first;
    if (first == last) {
      first= null;
      last= null;
    }
    else {
      first=first.next;
      first.prev=null;
    }
    return old_first.element;
  }

Node old_first=first; allows us to remember the reference to the node in front of the queue, since the manipulations that follow will effectively take that node out of the list. If the node in front of the queue is the last node in the queue, we set both first and last to null. Otherwise, we just make the node right after first, the first node by saying first=first.next;. We then use the statement first.prev=null; to set this newly appointed first node's prev variable to null. Now only old_first references the old first node. Once we return the element in the node referenced by old_first this node can be garbage collected.


Testing stringQueue

test1.java is designed to test either version of stringQueue. As you recall, stringQueue is an abstract class with no implementation of its own. We extended it to have a fixed (array) and a dynamic (linked list) implementation. It is designed so that user can specify a datafile, the version to test, and if necessary, the number of cells for the array in the "fixed" version. This information is taken from the command line. Here are two examples of running this program:

java test1 test1.dat fixed 10 will test the fixed version using test1.dat as the data file. Also, the size 10 is chosen here for the array that keeps the queue elements.

java test1 test1.dat dynamic will test the dynamic version using test1.dat as the data file. Notice that no size is specified, as it is not relevant in the dynamic case.

Command Line Parameters

Processing of command line parameters is mostly useful in utility programs, such as, mv and ls. In test1.java, it helps to test the different versions stringQueue quickly, without engaging in long dialog with the program to set things up.

In test1.java, the main method has an array of strings as a parameter (arg). This array is filled with the space separated strings on the command line that appear after the program name. arg[0] in the examples shown above is test1.dat. When looking at the code for this program, you will notice that the file opened for input gets its name from arg[0].

Comparing Objects

One of the things that I like you to notice is the use of the equals method when trying to figure out which version is being tested. You ask yourself, why couldn't we simply say if (arg[1] == dynamic). The reason is simple, "==" testes equality of the references and not what the objects contain. In other words, we can only find out if arg[1] and dynamic are both referencing the same memory cell.

Delayed Construction

Note that the declaration of q occurs before we know which kind the user is testing. Recall that you can't make objects of stringQueue, it is an abstract class. Once the program figures out which version is to be tested, either q = new dynamicStringQueue();, or q = new fixedStringQueue(size); are used to creat the right version of the queue.

Polymorphism

Consider the code segment:


      while(!q.empty()) {
        text = q.dequeue();
        System.out.println(text+" was removed from the queue");
      }

What if we want to take this out of the main method and put in its own method, say, named out. It would be invoked from main with a call like out(q);. Here is the catch, what kind parameter, do you think, out needs to accept? It needs to be stringQueue, as you don't know which implementation is used, so how can you use either of its subclasses. But, this not a problem, as this is one of the fundamental advantages for Object Orientation which we call Polymorphism. Here is a look at the out method:


static void out(stringQueue q) {
    while(!q.empty()) {
      String text = q.dequeue();
      System.out.println(text+" was removed from the queue");
    }
  }

Generic Stack

genStack.java is designed to allow for any object to be stored in the stack. As long as, the element being stored is an object of a class that is a descendant of the class Object, we can put the element on and later take it off the stack. For example, in test1.java we use a generic stack to hold strings.

Notice that an object of the Vector Class is used to hold our elements.

the one tricky issue that is easy to over look is how do you know what kind of an object you are poping off the stack? As I said earlier, the stack itself doesn't what you put on or take off. The test program used for testing this know puts strings on and thus know that it is taking strings off. But, the following syntax is necessary to successfully pop the strings off the stack: text = (String)s.pop();. The use of (String) identifies the object coming back from the stack as a string. Be aware, this technique is not for changing an apple into an orange, it is more like saying, I got a fruit and I know its an orange.