DATA STRUCTURES

INTRODUCTION

Fixed-size structures such as single-subscripted arrays, double-sub-scripted arrays and structs. Dynamic data structures that grow and shrink during execution. Linked lists are collections of data items "lined up in a row: insertions and removals are made anywhere in a linked list. Stacks are important in compilers and operating system—insertions and removals are mode only at one end of at stack—its top. Queues represent waiting lines; insertions are made at the back (also referred to as the tail) of a queue and removals are made from the front (also referred to as the head) of a queue. Binary trees facilitate high-speed searching and sorting of data, efficient elimination of duplicate data items, representing file system directories and compiling expressions into machine language.

SELF-REFERENTIAL CLASSES

A self-referential class contains a pointer member that points to a class object of the same class type.

Self-referential class objects can be linked together to form useful data structures such as list, queues, stacks, and trees.

DYNAMIC MEMORY ALLOCATION creating and maintaining dynamic data structures requires dynamic memory allocation—the ability for a program to obtain more memory space at execution time to hold new nodes and to release space no longer needed. The limit for dynamic memory allocation can be as large as the amount of available physical memory in the computer or the amount of available virtual memory in a virtual memory system. Often, the limits are much smaller because available memory must be shared among many users.

LINKED LISTS

A linked list is a linear collection of self-referential class objects, called nodes, connected by pointer links—hence, the term ‘linked’ list. A linked list is accessed via a pointer to the first node of the list. Subsequent nodes are accessed via the link-pointer member stored in each node. By convention, the link pointer in the last node of a list is set to null (zero) to mark the end of the list. Data are stored in a linked list dynamically—each node is created as necessary. A node can contain data of any type including objects of other classes. If nodes contain base-class pointers or base class references to base-class and derived-class objects related by inheritance, we can have a linked list of such nodes and use virtual function calls to process these objects polymorphically. Stack and queues are also linear data structures, and as we will see are constrained versions of linked lists. Trees are nonlinear data structures.

Lists of data can be stored in arrays, but linked lists provide several advantages. A linked list is appropriate when the number of data elements to be represented in the data structure at one is unpredictable. Linked lists are dynamic, so the length of a list can be increase or decrease as necessary. The size of a ‘conventional array, however, cannot be altered, because the array size is fixed at compile time. "conventional" arrays can become full. Linked lists become full only when the system has insufficient memory to satisfy dynamic storage allocation requests.

An array can be declared to contain more elements than the number of items expected, but this can waste memory . Linked lists can provide better memory utilization in these situations. Linked lists allow the program to adapt at run time.

Insertion and deletion in a sorted array can be time consuming—all the elements following the inserted or deleted element must be shifted appropriately.

The elements of an array are stored contiguously in memory. This allows immediate access to any array element because the address of any element can be calculated directly based on its position relative to the beginning of the array. Linked lists do no afford such immediate "direct access" to their elements.

Linked list nodes are normally not stored contiguously in memory. Logically, however, the nodes of a linked list appear to be contiguous.

Using dynamic memory allocation (instead of arrays) for data structures that grow and shrink at execution time can save memory. Pointers occupy space and that dynamic memory allocation incurs the overhead of function calls.

Assign null (zero) to the link member of a new node Pointers should be initialized before they are used.

The kind of linked list we have been discussing is a singly-linked-list—the list begins with a pointer to the first node and each node contains a pointer to the next node "in sequence". This list terminates with a node whose pointer member is 0. A singly-linked-list may be traversed in only one direction.

A circular, singly-linked list begins with a pointer to the first node and each node contains a pointer to the next node. The "last node" does not contains a 0 pointer; rather , the pointer in the last node points back to the first node, thus closing the "circle".

A doubly linked list allows traversals both forwards and backwards. Such a list is often implemented with two "start pointers"—one that points to the first element of the list to allow front-to-back traversal of the list, and one that points to the last element of the list to allow back-to-front traversal of the list. Each node has both a forward pointer to the next node in the list in the forward direction and a backward pointer to the next node in the list in the backward direction.

In a circular, doubly-linked list, the forward pointer of the last node pointers to the first node and the backward pointer of the first node points to the last node, thus closing the "circle".

STACKS

A stack is a constrained version of a linked list—new nodes can be added to a stack and removed from a stack only at the top. A stack is referred to as a LIFO data structure. The link member in the list node of the stack is set to null (zero) to indicate the bottom of the stack.

Not setting the link in the bottom node of a stack to null (zero) is a programming error.

The primary member functions used to manipulate a stack are push and pop. Function push adds a new node to the top of the stack. Function pop removes a node from the top of the stack, stores the popped value in a reference variable that is passed to the calling function, and returns true if the pop operation was successful.

Stack have many interesting applications. When a function call is made, the called function must know how to return to its caller, so the return address is pushed onto a stack. If a series of function calls occurs, the successive return values are pushed onto the stack in LIFO order so that each function can return to its caller. Stacks support recursive function calls in the same manner as conventional nonrecursive calls.

When the function returns to it caller, the space for that function’s automatic variables is popped off the stack, and those variables are no longer known to the program.

Stacks are used by compilers in the process of evaluating expressions and generating machine language code.

QUEUES

A queue is similar to a supermarket checkout line—the first person in line is serviced first and other customers enter the line at the end and wait to be serviced. Queue nodes are removed only from the head of the queue and are inserted only at the tail of the queue. A queue is referred to as FIFO data structure.

Queues have many applications in computer systems. Most computers have only a single processor, so only one user at a time can be serviced. Entries for the other users are places in a queue. Each entry gradually advances to the front of the queue as users receive service. The entry at the front of the queue is the next to receive service.

Queues are also used to support print spooling. A multi-user environment may have only a single printer. Many users may be generating outputs to be printed. If the printer is busy, other outputs may still be generated. These are "spooled " to disk where they wait in a queue until the printer become available.

TREES

Linked lists, stacks, and queues are linear data structures. A tree in a nonlinear, two-dimensional data structure with special properties. Tree nodes contain two or more links. Binary trees—trees whose nodes all contain two links (none, one or both of, which may be null). The root node is the first node in a tree. Each link in the root node refers to a child. The left child is the root node of the left subtree, and the right child is the root node of the right subtree. The children of a node are called siblings. A node with no children is called a leaf node. Normally draw trees from the root node down—exactly the opposite of trees in nature.

A binary search tree With no duplicate node values) has the characteristic that the value in any left subtree are less than the value in its parent node, and the values in any right subtree are greater than the value in its parent node. A binary search tree and traverses three ways—using recursive inorder, preorder, and postorder traversals.

A node can only be inserted as a leaf node in a binary search tree. If the tree is empty an new node is created, initialized, and inserted in the tree.

In order Traversal of a binary search tree prints the node values in ascending order. The process of creating a binary search tree actually sorts the data—and thus this process is called the binary tree sort. The steps are

  1. Process the value in the node.
  2. Traverse the left subtree with a preordertraversal.
  3. Traverse the right subtree with a preordertraversal.

The binary search tree facilitates duplicate elimination. As the tree is being created, an attempt to insert a duplicate value will be recognized because a duplicate will follow the same " go left" or " go right" decisions on each comparison as the original value did. The duplicate will eventually be compared with a node contains the same value. The duplicate value may simply be discarded at this point.

Searching a binary tree for a value that matches a key value is also fast. If the tree is tightly packed, then each level contains about twice as many elements as the previous level.

SUMMARY

COMMON PROGRAMMING ERRORS