What is a data structure?
A data structure is a method for organising a set of data. The structure is defined by how the data is stored and how operations, such as data access, insertion and deletion, are performed on the stored data. Data structures are essential tools for programmers, as each structure has a set of benefits that make it useful for solving certain types of problem.
A linked list consists of a series of nodes. Each node stores a piece of data and a pointer that points to the next node in the list. There are two special pointers within the list:
- The head pointer, this points to the first node in the list. The head pointer is all that is needed to access the whole list.
- A null pointer, this indicates that the end of the list has been reached (If the head is a null pointer then this represents an empty list).
Because pointers are used to link the data together, there is no requirement for the list's data to be stored contiguously in memory.
The above code declares a template linked list class in C++. The only data that the class directly stores is a head pointer. The structure of the list node's also needs to be declared (but this could be moved out of the class). The class contains member functions to insert a new piece of data and access or delete the data at a certain position within the list.
The constructor just initialises the head pointer as a null pointer, i.e. a new list starts out as an empty list. The destructor is much more important. It traverses over the whole list and frees up the memory used to store each of the list's nodes. This is crucial as new list nodes will be created using dynamically allocated memory and hence they need deallocating when the list is destroyed.
The above code implements the insertion method for our linked list class. Firstly, it allocates memory to create a new node. It then sets the node's data to the user input data and sets the node to point to the current list. The list head is then changed to point to the new node.
This method of insertion inserts every element from the head position. Making this choice leads to insertions being very simple, only involving the assignment of pointers. This leads to the insertion operation scaling as O(1) i.e. insertion is very fast.
The above code implements the method for accessing data within our linked list class. It creates a local pointer to store the current node and an integer representing the current node's position within the list. The current pointer is initialised to the list head (which we choose to label as the 0th position). It is then repeatedly moved onto the next node. This loops until either the user's requested position or the end of the list has been reached. If the current pointer ends up as a null pointer then the user has requested a position that is out of the list's range and the programmer needs to choose how to deal with this. Otherwise, we simply return the data pointed to by the current pointer.
Accessing the first few lists elements will be fast, roughly O(1). However, the general access time scales as O(n), where n is the size of the list. This is because the list has to be traversed sequentially (in order from the start of the list). This will make accessing the data in long lists very slow.
The above code implements the deletion method for our linked list class. It is very similar to the previously implemented access method. It traverses the list in exactly the same way but also stores a pointer to the previously visited node. The previous node is then changed to point to the node after the current node (basically cutting the current node out of the list). When the head is deleted, there is no previous node so the head pointer is modified instead. The current node is then deleted from memory.
The time efficiency of deletion is identical to accessing list elements. The first few elements are O(1) but generally it is O(n).
Advantages and disadvantages
+ The linked list is a dynamic structure. It allows resizing at runtime and doesn't need an to be initialised as a specific size during compile time.
+ Insertion can be implemented quickly, independent of the data size.
+ Deletion could also be implemented quickly if elements were removed from the head.
- The list wastes some memory by storing a pointer alongside each piece of data.
- Elements must be accessed sequentially, much slower than random access.
- Reverse traversal (from back to front) of a linked list is very difficult.
Like all other data structures, a linked list should be used when it's attributes are suitable for the problem. Some examples being:
- To store a history of visited web pages. The most recently visited web page being added to the top of the list makes sense. Displaying the history will also naturally require the whole list to be accessed sequentially.
- To store the enemies within a video game. Essential operations such as collision checking will need to be performed on the whole group of enemies. Potential deletions would also be performed as a result of the collision check. Therefore, the lack of fast random access for specific enemies is not a problem.
- To implement other data structures. Linked lists can be used as the underlying implementation of other data structures, such as variable sized arrays, stacks and queues.
- Linked lists are often used to resolve collisions in hash tables.
Doubly linked list
A doubly linked list contains nodes that also store a pointer to the previous node. This removes the difficulty of reverse traversal, at the cost of storing an extra pointer within each node. Doubly linked lists will also typically store a tail pointer, pointing to the last element in the list. A tail pointer enables the possibility of fast insertion or deletion of elements from the tail.
Circularly linked list
Circularly linked lists are created by having a node that points back to a previous list node (creating a circular loop). This means that the list can be fully traversed by starting from any node, not just the head as in the case of a linear linked list. Circular linked lists are useful for applications where the list needs to cycled through repeatedly.
Test your knowledge
For each question, choose the best answer. The answer key is below.
- What does the condition (head == nullptr) represent?
- An error has occurred within the list.
- The list is currently empty.
- The list doesn't exist.
- Which operation is generally the fastest to perform on a linked list?
- Inserting new data.
- Accessing data.
- Deleting data.
- Which data structure is a more memory efficient method of storing data?
- An array.
- A linked list.
- The list is currently empty.
- Inserting new data.
- An array.
Alternative data structures
© 2018 Sam Brind