Algorithms and Data Structures/Data Structures

[Data Structures] Arrays and Strings

brightlightkim 2022. 5. 18. 12:27

Hash Tables

A data structure that maps keys to values for highly efficient lookup.

 

1. Compute the key's hash code, usually an int or long. Note that two different keys could have the same hash code, as there may be an infinite number of keys and a finite number of ints

 

2. Then, map the hash code to an index in the array. This could be done with something like hash(key) % array_length. Two different hash codes could, of course, map to the same index.

 

3. At this index, there is a linked list of keys and values. Store the key and value in this index. We must use a linked list because of collisions. You could have two different keys with the same hash code or two different hash codes that map to the same index.

 

To retrieve the value pair by its key, you repeat this process. Compute the hash code from the key, and then compute the index from the hash code. Then, search through the liked list for the value with this key.

 

If the collisions are very high, the worst-case runtime is O(N), where N is the number of keys. However, we generally assume a good implementation that keeps collisions to a minimum, in which case the lookup time is O(1)

 

Alternatively, we can implement the hash table with a balanced binary search tree. This gives us an O(log N) lookup time. The advantage of this is potentially using less space since we no longer allocate a large array. We can also iterate through the keys in order, which can be useful sometimes.

 

ArrayList and Resizable Arrays

In some languages, arrays are automatically resizable. The array or list will grow as you append items. In other languages, like Java, arrays are fixed lengths. The size is defined when you create the array.

 

When you need an array-like data structure that offers dynamic resizing, you usually use an ArrayList. An ArrayList is an array that resizes itself as needed while providing O(1) access. A typical implementation is that when the array is full, the array doubles in size. Each doubling takes O(n) time but rarely happens that is amortized insertion time is still O(1).

 

ArrayList<String> merge(String[] words, String[] more){
    ArrayList<String> sentence = new ArrayList<String>();
    for (String w : words) sentence.add(w);
    for (String w : more) sentence.add(w);
    return sentence;
}

 

StringBuilder

Concatenating Strings

ArrayList<String> merge(String[] words, String[] more){
    ArrayList<String> sentence = new ArrayList<String>();
    for (String w : words) sentence.add(w);
    for (String w : more) sentence.add(w);
    return sentence;
}

On each concatenation, a new copy of the string is created, and the two strings are copied over. character by character. The first iteration requires us to copy x characters. The second iteration requires copying 2x characters. The third iteration requires 3x and so on. The total time, therefore, is O(x + 2x + ... + nx).

This reduces to O(xn^2)

 

1+2+3+ ... + n = n(n+1)/2

 

StringBuilder can help you avoid this problem. StringBuilder simply creates a resizable array of all the strings, copying them back to a string only when necessary.

 

String joinWords(String[] words) {
    StringBuilder sentence = new StringBuilder();
    for (String w : words) {
        sentence.append(w);
    }
    return sentence.toString();
}