Back to the Basics: Data Structures - Arrays

Date Created
Sep 6, 2022 09:54 PM
Date Published
Sep 24, 2022
This is more of a note-taking space right now. I’m following along
lanepartonUpdated Sep 7, 2022

About Arrays:

Arrays (video)
  • contiguos area of memory consisting of equal-sized elements indexed by contiguous integers
  • Add/Remove to end: O(1)
  • Add/Remove at n: O(n)
UC Berkeley CS61B - Linear and Multi-Dim Arrays (video) (Start watching from 15m 32s)
  • Arrays (in Java): An object consisting of a numbered list of variables. Each is a primitive type or a reference to an object.
  • char[] c; // reference to array (any length) of characters
    • In Java, unlike C - you don’t have to decide the length yet
    • c = new char[4];
    • c[0] = 'b';
    • c[3] = 'e';
    • c[4] = 's'; // Run-time error (Java) array index out of bounds
    • (Java) double quotes for String, single quote for Characters
    • Field c.length defines the length of the array. You can’t assign a value to the field - c.length = 7; is a compile-time error
      • Strings have .length() - arrays have .length - in other words, Strings have a method, arrays have a field
      // Primes Revisited - Sieve of Eratosthenes // Loop through possible divisors, wipe out each number it divides public static void printPrimes(int n) { // n + 1 creates an array from 0 -> n boolean[] prime = new boolean[n + 1]; for(int i = 2; i <= n; i++) { prime[i] = true; } // Do all divisors that are <= n^2 // TODO: Why can we stop at root n? for(int divisor = 2; divisor * divisor <= n; divisor++) { if(primse[divisor]) { for(i = 2 * divisor; i <= n; i = i + divisor) { prime[i] = false; } } } for (i = 2; i <= n; i++) { if(prime[i]) { System.out.print(" " + i); } } } // TODO: Stack trace n = 10
  • Multi-dimensional arrays : an array of references to arrays
    • Pascals Triangle:
      • notion image
    • row i represents coefficients of
      • = coefficients → 1, 4, 6, 4, 1
      public static int[][] pascalTriangle(int n) { int[][] pt = new int[n][]; for(int i = 0; i < n; i++) { pt[i] = new int[i+1]; pt[i][0] = 1; for(int j = 1; j < i; j++) { pt[i][j] = pt[i-1][j-1] + pt[i -1][j]; } pt[i][j] = 1; } return pt; } // TODO: Stack trace 5
      I like this visual representation of pascals triangle in an array form.
      I like this visual representation of pascals triangle in an array form.
Dynamic Arrays (video)
  • Static arrays are static int my_array[100]
  • int *my_array = new int[size] - what if you don’t know the size at the time of creation?
  • Store a pointer to the dynamically allocated array and replace it with a newly-allocated array as needed.
  • Dynamic Array = Abstract data type with the following operations
    • Get(i) = O(1)
    • Set(i, val) = O(1)
      • Both constant time O(1)
    • PushBack(val) - add to end = O(n)
    • Remove(i) = O(n)
    • Size() = O(1)
    • How do we implement this?
      • We store:
        • arr: dynamically allocated array
        • capacity
        • size
    • So you basically create a new array with a new size when the array grows too large?
      • if size = capacity
    • C++ = vector
    • Java = ArrayList
    • Python = list
Jagged Arrays (video)
  • Jagged Array: Array of Arrays
  • int[][] ml = new int[4][]
  • Jagged Arrays have various rows of unequal amounts of day (rows with varying size)

Implement a vector (mutable array with automatic resizing):

What is a vector anyway?
TLDR: It’s a dynamic array with the ability to resize automatically
Vectors are the same as dynamic arrays with the ability to resize itself automatically when an element is inserted or deleted, with their storage being handled automatically by the container. Vector elements are placed in contiguous storage so that they can be accessed and traversed using iterators. In vectors, data is inserted at the end. Inserting at the end takes differential time, as sometimes the array may need to be extended. Removing the last element takes only constant time because no resizing happens. Inserting and erasing at the beginning or in the middle is linear in time.
What does this even mean?
They would start by hiding the defining a structure that would hold members necessary for the implementation. Then providing a group of functions that would manipulate the contents of the structure.
Practice coding using arrays and pointers, and pointer math to jump to an index instead of using indexing.
New raw data array with allocated memory
  • can allocate int array under the hood, just not use its features
  • start with 16, or if starting number is greater, use power of 2 - 16, 32, 64, 128
size() - number of items
capacity() - number of items it can hold
at(index) - returns item at given index, blows up if index out of bounds
insert(index, item) - inserts item at index, shifts that index's value and trailing elements to the right
prepend(item) - can use insert above at index 0
pop() - remove from end, return value
delete(index) - delete item at index, shifting all trailing elements left
remove(item) - looks for value and removes index holding it (even if in multiple places)
find(item) - looks for value and returns first index with that value, -1 if not found
resize(new_capacity) // private function
  • when you reach capacity, resize to double the size
  • when popping an item, if size is 1/4 of capacity, resize to half

Array Time

  • O(1) to add/remove at end (amortized for allocations for more space), index, or update
  • O(n) to insert/remove elsewhere

Array Space

  • contiguous in memory, so proximity helps performance
  • space needed = (array capacity, which is >= n) * size of item, but even if 2n, still O(n)