arrays

abstract data structure, array, python | 18 September 2022

array

Array as an abstract data type

As an abstract data type, an array is a list of similar data (sounds like circular definition). An array of an array is a 2-dimensional array, i.e. matrix.

  • name
  • type: the type of data is stored, and all have to be the same within one array
  • size: it is fixed once it is defined, and cannot be changed once created
  1. Access: random access use index as all elements are indexed, run time is \(O(1)\). This is the advantage of arrays.
  2. Search: \(O(n)\), may need to go over each element to find an item from an unsorted array
  3. Insert: \(O(n)\), because we need to shift the elements, which are after the index where we want to insert the item, to the ‘right’ for one space
  4. Delete: \(O(n)\), because we need to shift the elements, which are after the index where we want to insert the item, to the “left” for one space
  • Inserting element to the front of array is slow. It is faster to write to the back (append)
  • Deleting an element will require moving all subsequent elements. It is better to overwrite it so as not to disturb other elements
  • Sort at the end when returning if possible
  • Try to reduce space complexity to \(O(1)\) (in-place)

Array elements are stored in contiguous (continuous) memory locations. Its efficiency is in scaling its attributes to all its elements.

ArrayList is a derivation of array. It is a growing array, where the size can be changed. It has more functionality than an array, including the following 6 common methods:

  • add
  • remove
  • get
  • set
  • clear
  • toArray

Python list

In Python, arrays and arrayLists are grouped together into a single data structured called “Lists”. See the source code of Python list. Unlike arrays in some other languages, list length can be changed dynamically, and can hold different data types.

Note that the tuple type is very similar to the list type, except that tuple type is immutable. However, if a list is an element of a tuple, it is still mutable.

Numpy array is an multidimensional implementation of array. It is efficient both in time and space complexity.

Because of these flexbilities, Python lists have a higher costs for each element than numpy arrays, we need eight bytes for the reference.

In [1]: A = ['me', 'you', 1, 2, (1,2)]

In [2]: A[-1]
Out[2]: (1, 2)

In [3]: list(A[0])
Out[3]: ['m', 'e']

In [4]: A = [1]

In [5]: A*5
Out[5]: [1, 1, 1, 1, 1]
Action syntax
instantiating 1D array [0,1,2,3,4], list(range(5))
instantiating 2D array [[0,1,2,3,4],[1,2], [0]], also called “nested list”
access [0] gives the first element, [~0] reverse access
append .append()
insert .insert(2,100)
reverse .reverse() in-place, or reversed(list) returns an iterator

Slicing

Since accessing array is an \(O(1)\) operation, it is important to know how to access elements via slicing using : the slicing operator and other directional operators - and, to a less extent, ~ (reverse direction).

  • [x:y:z]: begin at x, end at y-1, step size z.
    For example [5:1:-2] means begin at index 5, end at index 1, with step size -2, i.e. the indices sliced are: 5, 3

  • [-1] is the last element of list

List comprehension

One special way to create lists is list comprehension.

In [6]: print([x**2 for x in range(6) if x%2 == 0])
Out[6]: [0 4 16]
  • Loop through rows of numpy array as a list of lists List comprehension can be used on numpy arrays.
    [i.mean() for i in A] iterates through elements (rows) of array A, and outputs the mean for each.
In [7]: import numpy as np
    ...: A = np.random.randint(10, size = (4,4))
    ...: print(A)
[[2 1 6 9]
 [3 9 6 8]
 [4 0 7 2]
 [8 6 4 6]]
In [8]: [i.mean() for i in A]
Out[8]: [4.5, 6.5, 3.25, 6.0]
  • Filtering List comprehension can be used to filter a list of strings.
In [9]: countries = ['USA', 'Uruguay','Uzbekistan','Venezuela','Vietnam','UFO']
    ...: [i for i in countries if i.lower().startswith('u') and i.lower().endswith('a') and len(i) > 2]
Out[9]: ['USA']
  • Access elements of elements List comprehension can be used to flatten a list of lists. [i for i in A] produces the same A. After the expression i is substituted by j, which iterates through i, we get the elements of the elements in one list.
    In [10]: A = [[1,2,3],[(3.14, 2.718281828459045)],['sky','earth']]
      ...: [i for i in A]
    Out[10]: [[1, 2, 3], [(3.14, 2.718281828459045)], ['sky', 'earth']]
    In [11]: [j for i in A for j in i]
    Out[11]: [1, 2, 3, (3.14, 2.718281828459045), 'sky', 'earth']
    
  • Test palindromic and other operations List comprehension is used in the following example to check if a string is palindromic.
In [12]: s = 'abcdcba'

In [13]: all(s[i] == s[~i] for i in range(len(s) // 2))
Out[13]: True
  • When NOT to use list comprehension List comprehension is very useful for small lists. But, do not use list comprehension when:
  • the desired output is not a list
  • the list is long, which costs too much in memory and run time
  • although list comprehensions can handle multiple layers of loops, more than 2 layers becomes difficult to understand

Implement the array ADT in Python

The following code are from Github repository “Data Structure using Python”. This implementation does not have any error handling. For example, error should be raised if you insert an item that is not consistent with the type requirement.

The magic or dunder methods are predefined methods that simplify many operations that can be performed on a class instance

__getitem__(self, key) Defines behavior for when an item is accessed, using the notation self[key]. This is also part of both the mutable and immutable container protocols. It should also raise appropriate exceptions: TypeError if the type of the key is wrong and KeyError if there is no corresponding value for the key.

__setitem__(self, key, value) Defines behavior for when an item is assigned to, using the notation self[nkey] = value. This is part of the mutable container protocol. Again, you should raise KeyError and TypeError where appropriate.

a = Array(10, int): we initialize an array of size 10, all zeros. Then we do a few inserts. The same result can also be accomplished by direct indexing such as a[2] = 2. Note that after all that inserting, the length of the Array object remains 10.

codeArray.py
class Array(object):
    ''' size: denotes the total size of the array to be initialized
       Type: denotes the data type of the array(as all the elements of the array have same data type)
       Items: values at each position of array
    '''
    def __init__(self, size, Type = int):
        self.size = len(list(map(Type, range(size))))
        self.Items =[Type(0)] * size    # initialize array with zeroes
        self.Type = Type

    def __str__(self): # for printing
        return ' '.join([str(i) for i in self.Items])

    def __len__(self):
        return len(self.Items)

    # magic methods to enable indexing
    def __setitem__(self, index, data):
        self.Items[index] = data

    def __getitem__(self, index):
        return self.Items[index]

    # function for search
    def search(self, keyToSearch):
        for i in range(self.size):
            if (self.Items[i] == keyToSearch):      # brute-forcing
                return i                                 # index at which element/ key was found

        return -1                                        # if key not found, return -1

    def insert(self, keyToInsert, position):
        """
        function for inserting an element
        keyToInsert, position
        """
        if(self.size > position):
            for i in range(self.size - 2, position - 1, -1):
                self.Items[i + 1] = self.Items[i]
            self.Items[position] = keyToInsert
        else:
            print('Array size is:', self.size)

    def delete(self, keyToDelete, position):
        """
        function to delete an element
        keyToDelete, position
        """
        if(self.size > position):
            for i in range(position, self.size - 1):
                self.Items[i] = self.Items[i + 1]
            self.Items[i + 1] = self.Type(0)
        else:
            print('Array size is:', self.size)

a = Array(10, int) # we initialize an array of size 10, all of them are zeros
print(a)
# 0 0 0 0 0 0 0 0 0 0
a.insert(2, 2) 
a.insert(3, 1)
a.insert(4,7)
print(a)
# 0 3 0 2 0 0 0 4 0 0
print(len(a))
# 10
idx = a.search(3)
idx
# 1

Let’s define more functionality for our Array: reversing, rotating and find missing.

codea1_reverseArry.py

def  reversingAnArray(start, end, myArray):
    while(start < end):
        myArray[start], myArray[end - 1] = myArray[end - 1], myArray[start]
        start += 1
        end -= 1

def rotation(rotateBy, myArray):
    for i in range(0, rotateBy):
        rotateOne(myArray)
    return myArray

def rotateOne(myArray):
    for i in range(len(myArray) - 1):
        myArray[i], myArray[i + 1] = myArray[i + 1], myArray[i]

def findMissing(myArray, n):
    n = n - 1
    totalSum = (n * (n + 1)) // 2
    for i in range(0, n):
        totalSum -= myArray[i]
    return totalSum

print('Array before Reversing:',a)
reversingAnArray(0, len(a), a)
print('Array after Reversing:',a)

myArray = Array(10)
for i in range(len(myArray)):
    myArray.insert(i, i)
print('Before Rotation:',myArray)
print('After Rotation:',rotation(3, myArray))
# OUTPUT:
# Before Rotation: 0 1 2 3 4 5 6 7 8 9
# After Rotation: 3 4 5 6 7 8 9 0 1 2

myArray = Array(10)
for i in range(len(myArray)):
    myArray.insert(i, i)
myArray.delete(4, 4)
print('Original Array:',myArray)
print('Missing Element:', findMissing(myArray, len(myArray)))

# OUTPUT:
# Original Array: 0 1 2 3 5 6 7 8 9 0
# Missing Element: 4