What is a data structure?
A data structure is a method for organising a set of data. The structure is defined by how the data is stored and how operations, such as data access, insertion and deletion, are performed on the stored data. Data structures are essential tools for programmers, as each structure has a set of benefits that make it useful for solving certain types of problem.
An array is used to store a fixed number of data elements of the same data type. A single block of memory is set aside to store the whole array. The array's data elements are then contiguously stored within the designated block.
Conceptually, an array is best thought of as a collection of items that are related in someway. For example, an array storing numbers that represent the value of the cards within your hand while playing poker. Arrays are the most commonly used data structure and as such are directly included within most programming languages.
Like any other variable, arrays should be initialised before being used in the program. C++ provides different methods to initialise an array. Each array element can be set manually by looping over each array index. Alternatively, an initialiser list can be used to initialise the whole array in a single line. Different variations of the initialiser list syntax are allowed, as shown in the code below. An empty list will initialise the array to contain zeros or specific values for each element can be specified.
Array elements are accessed through requesting an array index. In C++ this is done through the subscript operator, the syntax being: "Array_name[int index]". Arrays are zero-indexed, this means that the first element is given the index 0, the second element is given the index 1 and up to the last element being given an index equal to 1 less than the array's size.
Because the array's data is stored contiguously, it is simple for the computer to find the requested data element. The array variable stores the starting memory address of the array. This can then be moved forward by the requested index multiplied by the size of the data type stored in the array, reaching the starting address of the requested element. Storing the array as a block of memory also allows the computer to implement random access of individual elements, this is a fast operation, scaling as O(1).
Insertion and deletion
Inserting a new element or deleting a current array element isn't possible due to the restriction of the array being a fixed size. A new array (bigger or smaller by one element) would have to be created and the relevant elements copied over from the old array. This makes the operations inefficient and best handled by using a dynamic data structures instead of using an array.
Passing arrays to a function
In C++, the default method for passing parameters into functions is passing by value. You would then expect that passing an array would create a copy of the whole array. This isn't the case, instead the address of the first array element is passed by value. It is said that the array decays to a pointer (it can even be explicitly passed as a pointer). The decayed pointer no longer knows that it is meant to point to an array and any information relating to the array size is lost. This is why you will see most functions also taking a seperate array size variable. Care must also be taken as a non-constant pointer will allow modification of the array variables from within the function.
An array can also be passed by reference but the array size must be specified. This will pass the address of the first element by reference but it still retains the information that the pointer is pointing to an array. Due to the need to specify array size, this method is rarely used. In C++ 11, a standard library array class was introduced to deal with the issue of pointer decay.
Printing an array
Multidimensional arrays are arrays whose elements are also arrays. This allows increasingly complex structures to be created, but 2D arrays are the most commonly used. When accessing a multidimensional array, the subscript operators are evaluated from left to right.
A common use of a 2D array is to represent a matrix. The 2D array can be thought of a storing a collection of rows (or columns). Each of these rows is a 1D array of numbers.
Initialising a 3x3 identity matrix
Advantages and disadvantages
+ Arrays are the most efficient data structure for storing data. Only the data is stored and no extra memory is wasted.
+ Random access allows the fast access of individual data elements.
+ Multidimensional arrays are useful for representing complex structures.
- The size of the array needs to be declared at compile time (in advance of the program running).
- The array size is fixed and cannot be resized during runtime. This can lead to arrays being used that are oversized, to leave space for potential new elements but wasting memory on empty elements.
Arrays are ubiquitous in programming and can be used for almost any problem. However, the key to using data structures is to select the structure whose attributes best suit the problem. Some examples for arrays being:
- To store the objects placed on the board of a game. The board will always be a fixed size and fast access to a specific board space may be needed to modify the data stored there. For example, the user clicks an empty board space and the array element representing it needs to be changed from empty to full.
- To store a constant table of values. Arrays are the best option to store a constant set of values that will be looked up by the program. For example an array of the alphabetic characters, allowing the conversion of a number to a character by using it as an array index.
- As discussed previously, 2D arrays can store matrices.
The C++ STL (standard template library) contains an implementation of a dynamic array, known as a vector. The vector class removes the requirement of a fixed size by including methods to remove existing elements and add new elements. A very simple code example is included below to demonstrate these features.
Test your knowledge
For each question, choose the best answer. The answer key is below.
- Does an array waste any extra memory when storing data?
- Test would access which element of the Test array?
- The 3rd element.
- The 4th element.
- The 5th element.
- Which structure loses its size when passed to a function?
- The C++ built-in array
- The 4th element.
- The C++ built-in array
Alternative data structures
© 2018 Sam Brind