What Are Data Structures? Arrays

Updated on May 15, 2018

What is a data structure?

A data structure is a method for organising a set of data. The structure is defined by how the data is stored and how operations, such as data access, insertion and deletion, are performed on the stored data. Data structures are essential tools for programmers, as each structure has a set of benefits that make it useful for solving certain types of problem.

Arrays

General idea

An array is used to store a fixed number of data elements of the same data type. A single block of memory is set aside to store the whole array. The array's data elements are then contiguously stored within the designated block.

Conceptually, an array is best thought of as a collection of items that are related in someway. For example, an array storing numbers that represent the value of the cards within your hand while playing poker. Arrays are the most commonly used data structure and as such are directly included within most programming languages.

An example array, called Numbers, storing five integers. Stored data is coloured blue.
An example array, called Numbers, storing five integers. Stored data is coloured blue.

Initialisation

Like any other variable, arrays should be initialised before being used in the program. C++ provides different methods to initialise an array. Each array element can be set manually by looping over each array index. Alternatively, an initialiser list can be used to initialise the whole array in a single line. Different variations of the initialiser list syntax are allowed, as shown in the code below. An empty list will initialise the array to contain zeros or specific values for each element can be specified.

//Declaration without initialisation
int test1[4]; //test1 = [four unknown values]
//Manually setting each value
for(int i{0}; i < 4; i++){
	test1[i] = i + 1;
}
//test1 = [1,2,3,4]

//Using an initialiser list
int test2[4] {}; //test2 = [0,0,0,0]
int test3[4] {1,2,3,4}; //test3 = [1,2,3,4]
int test4[4] {1}; //test4 = [1,0,0,0]
int test5[] {1,2,3,4}; //test5 = [1,2,3,4]

Accessing data

Array elements are accessed through requesting an array index. In C++ this is done through the subscript operator, the syntax being: "Array_name[int index]". Arrays are zero-indexed, this means that the first element is given the index 0, the second element is given the index 1 and up to the last element being given an index equal to 1 less than the array's size.

Because the array's data is stored contiguously, it is simple for the computer to find the requested data element. The array variable stores the starting memory address of the array. This can then be moved forward by the requested index multiplied by the size of the data type stored in the array, reaching the starting address of the requested element. Storing the array as a block of memory also allows the computer to implement random access of individual elements, this is a fast operation, scaling as O(1).

Insertion and deletion

Inserting a new element or deleting a current array element isn't possible due to the restriction of the array being a fixed size. A new array (bigger or smaller by one element) would have to be created and the relevant elements copied over from the old array. This makes the operations inefficient and best handled by using a dynamic data structures instead of using an array.

Passing arrays to a function

In C++, the default method for passing parameters into functions is passing by value. You would then expect that passing an array would create a copy of the whole array. This isn't the case, instead the address of the first array element is passed by value. It is said that the array decays to a pointer (it can even be explicitly passed as a pointer). The decayed pointer no longer knows that it is meant to point to an array and any information relating to the array size is lost. This is why you will see most functions also taking a seperate array size variable. Care must also be taken as a non-constant pointer will allow modification of the array variables from within the function.

An array can also be passed by reference but the array size must be specified. This will pass the address of the first element by reference but it still retains the information that the pointer is pointing to an array. Due to the need to specify array size, this method is rarely used. In C++ 11, a standard library array class was introduced to deal with the issue of pointer decay.

Printing an array

#include <iostream>
//Pass array address by value (implicit pointer)
void Print1(const int arr[], int size) {
	std::cout << "[";
	for (int i{ 0 }; i < size; i++) {
		std::cout << arr[i];
		if (i < size - 1) { std::cout << ","; }
	}
	std::cout << "]";
}
//Pass array address by value (explicit pointer)
void Print2(const int* arr, int size) {
	std::cout << "[";
	for (int i{ 0 }; i < size; i++) {
		std::cout << arr[i];
		if (i < size - 1) { std::cout << ","; }
	}
	std::cout << "]";
}
//Pass array address by reference (size must be specified)
void Print3(const int(&arr)[5]) {
	//size of array is preserved and can be used
	int size{ sizeof(arr) / sizeof(arr[0]) };
	std::cout << "[";
	for (int i{ 0 }; i < size; i++) {
		std::cout << arr[i];
		if (i < size - 1) { std::cout << ","; }
	}
	std::cout << "]";
}

int main(){
	int numbers[5]{1,2,3,4,5};
	Print1(numbers);
	Print2(numbers);
	Print3(numbers);
	return 0;	
}

Multidimensional arrays

Multidimensional arrays are arrays whose elements are also arrays. This allows increasingly complex structures to be created, but 2D arrays are the most commonly used. When accessing a multidimensional array, the subscript operators are evaluated from left to right.

A common use of a 2D array is to represent a matrix. The 2D array can be thought of a storing a collection of rows (or columns). Each of these rows is a 1D array of numbers.

An example 2D array of integers, which could be used to represent a 3x5 matrix. The chosen visual layout clearly demonstrates how it is analogous to a matrix. However, the computer would store the numbers as a single, contiguous block of memory.
An example 2D array of integers, which could be used to represent a 3x5 matrix. The chosen visual layout clearly demonstrates how it is analogous to a matrix. However, the computer would store the numbers as a single, contiguous block of memory.

Initialising a 3x3 identity matrix

const int size{3};
int identity[size][size];
for(int i{0}; i < size; i++){
	for(int j{0}; j < size; j++){
		if(i == j){
			identity[i][j] = 1;
		} else {
			identity[i][j] = 0;
		}
	}
}

Advantages and disadvantages

+ Arrays are the most efficient data structure for storing data. Only the data is stored and no extra memory is wasted.

+ Random access allows the fast access of individual data elements.

+ Multidimensional arrays are useful for representing complex structures.

- The size of the array needs to be declared at compile time (in advance of the program running).

- The array size is fixed and cannot be resized during runtime. This can lead to arrays being used that are oversized, to leave space for potential new elements but wasting memory on empty elements.

Uses

Arrays are ubiquitous in programming and can be used for almost any problem. However, the key to using data structures is to select the structure whose attributes best suit the problem. Some examples for arrays being:

  • To store the objects placed on the board of a game. The board will always be a fixed size and fast access to a specific board space may be needed to modify the data stored there. For example, the user clicks an empty board space and the array element representing it needs to be changed from empty to full.
  • To store a constant table of values. Arrays are the best option to store a constant set of values that will be looked up by the program. For example an array of the alphabetic characters, allowing the conversion of a number to a character by using it as an array index.
  • As discussed previously, 2D arrays can store matrices.

Dynamic arrays

The C++ STL (standard template library) contains an implementation of a dynamic array, known as a vector. The vector class removes the requirement of a fixed size by including methods to remove existing elements and add new elements. A very simple code example is included below to demonstrate these features.

#include <vector>
#include <string>

int main(){
	std::vector<std::string> fruits
	{"apple","banana","blueberry","cucumber"};
	//fruits = ["apple","banana","blueberry","cucumber"]
	fruits.pop_back();
	//fruits = ["apple","banana","blueberry"]
	fruits.push_back("kiwi");
	//fruits = ["apple","banana","blueberry","kiwi"]
	return 0;
}

Test your knowledge

view quiz statistics

Alternative data structures

Questions & Answers

    © 2018 Sam Brind

    Comments

      0 of 8192 characters used
      Post Comment

      No comments yet.

      working

      This website uses cookies

      As a user in the EEA, your approval is needed on a few things. To provide a better website experience, owlcation.com uses cookies (and other similar technologies) and may collect, process, and share personal data. Please choose which areas of our service you consent to our doing so.

      For more information on managing or withdrawing consents and how we handle data, visit our Privacy Policy at: https://owlcation.com/privacy-policy#gdpr

      Show Details
      Necessary
      HubPages Device IDThis is used to identify particular browsers or devices when the access the service, and is used for security reasons.
      LoginThis is necessary to sign in to the HubPages Service.
      Google RecaptchaThis is used to prevent bots and spam. (Privacy Policy)
      AkismetThis is used to detect comment spam. (Privacy Policy)
      HubPages Google AnalyticsThis is used to provide data on traffic to our website, all personally identifyable data is anonymized. (Privacy Policy)
      HubPages Traffic PixelThis is used to collect data on traffic to articles and other pages on our site. Unless you are signed in to a HubPages account, all personally identifiable information is anonymized.
      Amazon Web ServicesThis is a cloud services platform that we used to host our service. (Privacy Policy)
      CloudflareThis is a cloud CDN service that we use to efficiently deliver files required for our service to operate such as javascript, cascading style sheets, images, and videos. (Privacy Policy)
      Google Hosted LibrariesJavascript software libraries such as jQuery are loaded at endpoints on the googleapis.com or gstatic.com domains, for performance and efficiency reasons. (Privacy Policy)
      Features
      Google Custom SearchThis is feature allows you to search the site. (Privacy Policy)
      Google MapsSome articles have Google Maps embedded in them. (Privacy Policy)
      Google ChartsThis is used to display charts and graphs on articles and the author center. (Privacy Policy)
      Google AdSense Host APIThis service allows you to sign up for or associate a Google AdSense account with HubPages, so that you can earn money from ads on your articles. No data is shared unless you engage with this feature. (Privacy Policy)
      Google YouTubeSome articles have YouTube videos embedded in them. (Privacy Policy)
      VimeoSome articles have Vimeo videos embedded in them. (Privacy Policy)
      PaypalThis is used for a registered author who enrolls in the HubPages Earnings program and requests to be paid via PayPal. No data is shared with Paypal unless you engage with this feature. (Privacy Policy)
      Facebook LoginYou can use this to streamline signing up for, or signing in to your Hubpages account. No data is shared with Facebook unless you engage with this feature. (Privacy Policy)
      MavenThis supports the Maven widget and search functionality. (Privacy Policy)
      Marketing
      Google AdSenseThis is an ad network. (Privacy Policy)
      Google DoubleClickGoogle provides ad serving technology and runs an ad network. (Privacy Policy)
      Index ExchangeThis is an ad network. (Privacy Policy)
      SovrnThis is an ad network. (Privacy Policy)
      Facebook AdsThis is an ad network. (Privacy Policy)
      Amazon Unified Ad MarketplaceThis is an ad network. (Privacy Policy)
      AppNexusThis is an ad network. (Privacy Policy)
      OpenxThis is an ad network. (Privacy Policy)
      Rubicon ProjectThis is an ad network. (Privacy Policy)
      TripleLiftThis is an ad network. (Privacy Policy)
      Say MediaWe partner with Say Media to deliver ad campaigns on our sites. (Privacy Policy)
      Remarketing PixelsWe may use remarketing pixels from advertising networks such as Google AdWords, Bing Ads, and Facebook in order to advertise the HubPages Service to people that have visited our sites.
      Conversion Tracking PixelsWe may use conversion tracking pixels from advertising networks such as Google AdWords, Bing Ads, and Facebook in order to identify when an advertisement has successfully resulted in the desired action, such as signing up for the HubPages Service or publishing an article on the HubPages Service.
      Statistics
      Author Google AnalyticsThis is used to provide traffic data and reports to the authors of articles on the HubPages Service. (Privacy Policy)
      ComscoreComScore is a media measurement and analytics company providing marketing data and analytics to enterprises, media and advertising agencies, and publishers. Non-consent will result in ComScore only processing obfuscated personal data. (Privacy Policy)
      Amazon Tracking PixelSome articles display amazon products as part of the Amazon Affiliate program, this pixel provides traffic statistics for those products (Privacy Policy)