Java Data Structures

Disjoint Sets (Union-Find)

Two operations: connect(x, y) and isConnected(x, y)

public interface DisjointSets {
    /** connects two items P and Q */
    void connect(int p, int q);
    /** checks to see if two items are connected */
    boolean isConnected(int p, int q);
}

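Quick Find: uses a single array of integers; id[i] records the set that item i belongs to. isConnected is Θ(1), but connect is Θ(N).
Quick Union: parent[i] records the index of item i's parent; the root of a tree identifies its set.
Weighted Quick Union: always link the root of the smaller tree to the root of the larger tree, which keeps the maximum height of any tree at O(log N). A root stores the negated size of its tree: parent[root] = -sizeOfTree.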
https://github.com/Berkeley-CS61B-Student/sp19-s1365/blob/master/lab6/UnionFind.java
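
For comparison with the weighted version, here is a minimal sketch of Quick Find as described above. The class name QuickFindDS is made up for illustration and is not taken from the linked repository; it assumes all arguments are valid indices.

```java
/** Quick Find sketch: id[i] is the set number of item i. */
class QuickFindDS {
    private final int[] id;

    public QuickFindDS(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;   // every item starts in its own set
        }
    }

    /** Theta(N): relabel every member of p's set with q's set number. */
    public void connect(int p, int q) {
        int pid = id[p];
        int qid = id[q];
        if (pid == qid) {
            return;      // already in the same set
        }
        for (int i = 0; i < id.length; i++) {
            if (id[i] == pid) {
                id[i] = qid;
            }
        }
    }

    /** Theta(1): two items are connected iff their set numbers match. */
    public boolean isConnected(int p, int q) {
        return id[p] == id[q];
    }
}
```

The full array scan in connect is the cost that Quick Union and its weighted variant are designed to avoid.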

public class WeightedQuickUnionUF {
    private int[] parent;

    /* Creates a UnionFind data structure holding N vertices. Initially, all
       vertices are in disjoint sets. */
    public WeightedQuickUnionUF(int N) {
        parent = new int[N];
        for (int i = 0; i < N; i++) {
            parent[i] = -1;   // every vertex starts as the root of a tree of size 1
        }
    }

    /* Throws an exception if vertex is not a valid index. */
    private void validate(int vertex) {
        int n = parent.length;
        if (vertex < 0 || vertex >= n) {
            throw new IllegalArgumentException("index " + vertex + " is not between 0 and " + (n - 1));
        }
    }

     /* Returns the size of the set v1 belongs to. */
    public int sizeOf(int v1) {
        return -1 * parent[find(v1)];
    }

    /* Returns true if nodes v1 and v2 are connected. */
    public boolean connected(int v1, int v2) {
        return find(v1) == find(v2);
    }

    /* Returns the root of the set vertex belongs to. */
    public int find(int vertex) {
        validate(vertex);
        // A root holds a negative size; any other entry is a parent index (0 is a valid index).
        while (parent[vertex] >= 0) {
            vertex = parent[vertex];
        }
        return vertex;
    }
    
    /* Connects two elements v1 and v2 together. v1 and v2 can be any valid 
       elements, and a union-by-size heuristic is used. If the sizes of the sets
       are equal, tie break by connecting v1's root to v2's root. Unioning a 
       vertex with itself or vertices that are already connected should not 
       change the sets but may alter the internal structure of the data. */
    public void union(int v1, int v2) {
        validate(v1);
        validate(v2);
        int rootOfV1 = find(v1);
        int rootOfV2 = find(v2);
        if (rootOfV1 == rootOfV2) {
            return;
        } else if (sizeOf(v1) <= sizeOf(v2)) {
            // v1's tree is smaller, or the sizes tie: hang v1's root under v2's root
            parent[rootOfV2] -= sizeOf(v1);
            parent[rootOfV1] = rootOfV2;
        } else {
            parent[rootOfV1] -= sizeOf(v2);
            parent[rootOfV2] = rootOfV1;
        }
    }
    

}

Weighted Quick Union with Path Compression: during find, re-parent every item on the path directly to the root, which makes the tree shorter for later operations.

    /* Returns the root of the set vertex belongs to. Path compression is
       employed, allowing for fast search time. Needs java.util.ArrayList and
       java.util.List imported at the top of the file. */
    public int find(int vertex) {
        validate(vertex);
        List<Integer> toCompress = new ArrayList<>();
        int root = vertex;
        // Walk up to the root, remembering every vertex on the way.
        while (parent[root] >= 0) {
            toCompress.add(root);
            root = parent[root];
        }
        // Point every vertex on the path directly at the root.
        for (int i : toCompress) {
            parent[i] = root;
        }
        return root;
    }

Binary Search Tree

https://github.com/Berkeley-CS61B-Student/sp19-s1365/blob/master/lab7/BSTMap.java
find
insert

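// Assumes Key implements Comparable<Key>.
static BST insert(BST T, Key ik) {
    if (T == null) {
        return new BST(ik);
    }
    int cmp = ik.compareTo(T.key);
    if (cmp < 0) {
        T.left = insert(T.left, ik);
    } else if (cmp > 0) {
        T.right = insert(T.right, ik);
    }
    // cmp == 0: the key is already present, so the tree is unchanged
    return T;
}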

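delete: Hibbard deletion. A node with no children is simply removed, and a node with one child is replaced by that child. A node with two children is replaced by either the right-most node in its left subtree (its predecessor) or the left-most node in its right subtree (its successor).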
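
The three operations can be sketched together as follows. This is a minimal illustration under some assumptions: the class name IntBST is made up, keys are plain ints rather than a generic Key, and the two-child case always uses the successor. It is not the code from the linked BSTMap.

```java
/** Minimal int-keyed BST sketch with find, insert and Hibbard deletion. */
class IntBST {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    /** find: walk left or right until the key is found or we fall off the tree. */
    public boolean contains(int key) {
        Node cur = root;
        while (cur != null) {
            if (key < cur.key) cur = cur.left;
            else if (key > cur.key) cur = cur.right;
            else return true;
        }
        return false;
    }

    public void insert(int key) { root = insert(root, key); }

    private Node insert(Node t, int key) {
        if (t == null) return new Node(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        return t;                                  // duplicates are ignored
    }

    public void delete(int key) { root = delete(root, key); }

    private Node delete(Node t, int key) {
        if (t == null) return null;                // key not present
        if (key < t.key) {
            t.left = delete(t.left, key);
        } else if (key > t.key) {
            t.right = delete(t.right, key);
        } else {
            if (t.left == null) return t.right;    // zero or one child
            if (t.right == null) return t.left;
            // Two children: copy the successor's key here, then delete the successor.
            Node succ = t.right;
            while (succ.left != null) succ = succ.left;
            t.key = succ.key;
            t.right = delete(t.right, succ.key);
        }
        return t;
    }
}
```

Using the predecessor instead of the successor is equally valid; the choice here is arbitrary.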

BSTs have best case height Θ(log N), and worst case height Θ(N).

Balanced Search Tree

B-Tree / 2-3-4 Tree / 2-3-Tree

A node in a 2-3-4 tree can have 2, 3, or 4 children; a node in a 2-3 tree can have 2 or 3.

Always bushy.
Add works by adding items to existing leaf nodes. If a node becomes too full, it splits and pushes its middle item up to the parent.
All leaves must be the same distance from the root.
A non-leaf node with k items must have exactly k + 1 children.
Runtime for operations is O(log N).
Hard to implement.

Rotating

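// make a left-leaning link lean to the right
private Node rotateRight(Node h) {
    // assert (h != null) && isRed(h.left);
    Node x = h.left;
    h.left = x.right;
    x.right = h;
    // In an LLRB the colors move with the links (assumes a Node.color field).
    x.color = h.color;
    h.color = RED;
    return x;
}
// make a right-leaning link lean to the left
private Node rotateLeft(Node h) {
    // assert (h != null) && isRed(h.right);
    Node x = h.right;
    h.right = x.left;
    x.left = h;
    // In an LLRB the colors move with the links (assumes a Node.color field).
    x.color = h.color;
    h.color = RED;
    return x;
}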

Red-Black Trees

Left-leaning red-black trees (LLRB):

Have a 1-1 correspondence with 2-3 trees.
No node has 2 red links.
There are no red right-links.
Every path from root to leaf has the same number of black links (because 2-3 trees have the same number of links to every leaf).

Insert:

Always insert the new node with a red link.
If there is a right-leaning "3-node", rotate left the appropriate node to fix it.
If there are two consecutive red left links, rotate right the appropriate node to fix them.
If any node has two red children, we have a temporary 4-node; color flip the node to emulate the split operation.

private Node put(Node h, Key key, Value val) {
    if (h == null) { return new Node(key, val, RED); }
    int cmp = key.compareTo(h.key);
    if (cmp < 0) { h.left = put(h.left, key, val); }
    else if (cmp > 0) { h.right = put(h.right, key, val); }
    else { h.val = val; }

    if (isRed(h.right) && !isRed(h.left)) { h = rotateLeft(h); }
    if (isRed(h.left) && isRed(h.left.left)) { h = rotateRight(h); }
    if (isRed(h.left) && isRed(h.right)) { flipColors(h); }
    return h;
}
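
The put method above relies on isRed, flipColors and color-aware rotations that these notes do not show. Below is a self-contained sketch of how the pieces might fit together. It simplifies to int keys without values, the class name LLRBIntSet and the contains and height methods are additions for illustration, and the root is recolored black after every insert.

```java
/** Left-leaning red-black set of ints; a sketch, not reference code. */
class LLRBIntSet {
    private static final boolean RED = true;
    private static final boolean BLACK = false;

    private static class Node {
        int key;
        Node left, right;
        boolean color;   // color of the link from the parent to this node
        Node(int key, boolean color) { this.key = key; this.color = color; }
    }

    private Node root;

    /** Null links count as black. */
    private static boolean isRed(Node x) { return x != null && x.color == RED; }

    private static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.color = h.color;   // x takes over h's link to the parent
        h.color = RED;       // the link between x and h stays red
        return x;
    }

    private static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.color = h.color;
        h.color = RED;
        return x;
    }

    /** Emulates splitting a temporary 4-node: push the red link up. */
    private static void flipColors(Node h) {
        h.color = RED;
        h.left.color = BLACK;
        h.right.color = BLACK;
    }

    public void put(int key) {
        root = put(root, key);
        root.color = BLACK;  // the root has no parent link, so keep it black
    }

    private static Node put(Node h, int key) {
        if (h == null) return new Node(key, RED);
        if (key < h.key) h.left = put(h.left, key);
        else if (key > h.key) h.right = put(h.right, key);

        if (isRed(h.right) && !isRed(h.left)) h = rotateLeft(h);
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left) && isRed(h.right)) flipColors(h);
        return h;
    }

    public boolean contains(int key) {
        Node cur = root;
        while (cur != null) {
            if (key < cur.key) cur = cur.left;
            else if (key > cur.key) cur = cur.right;
            else return true;
        }
        return false;
    }

    /** Number of nodes on the longest root-to-leaf path. */
    public int height() { return height(root); }

    private static int height(Node x) {
        return x == null ? 0 : 1 + Math.max(height(x.left), height(x.right));
    }
}
```

Inserting keys in sorted order, which would produce a spindly plain BST, leaves this tree with logarithmic height.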

For trees, elements / keys need to be comparable.

Hashing & HashTable

Computing the hash code of an object:

It must be an integer (in Java, hashCode() returns an int).
If we run .hashCode() on an unmodified object twice, it should return the same number.
Two objects that are considered .equals() must have the same hash code (equal hash codes do not imply equals, but equals always implies equal hash codes).
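
A small value class that satisfies these rules might look like the sketch below. GridPoint is a hypothetical class invented for this example; Objects.hash is just one convenient way to combine fields.

```java
import java.util.Objects;

/** Hypothetical value class used only to illustrate the equals/hashCode contract. */
final class GridPoint {
    private final int x;
    private final int y;

    public GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof GridPoint)) return false;
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    /** Built from the same fields as equals, so equal points hash alike. */
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

Because the fields are final, the hash code of a GridPoint cannot change while it sits in a hash table.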

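Add item:

Compute the item's hash code and reduce it to a bucket index: index = hashcode mod M, where M is the number of buckets. In Java, Math.floorMod(hashcode, M) is safer than %, which returns a negative result for a negative hash code.
If the index has no items, create a new List and place the item there.
If the index has a List already, check the List to see if the item is already in there. If not, add the item to the List.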

Contains item:

Compute the item's index from its hash code, as for add.
If the index is empty, return false.
Otherwise, check all items in the List at that index, and if the item exists, return true.

HashTable
https://github.com/Berkeley-CS61B-Student/sp19-s1365/blob/master/lab8/MyHashMap.java
An array of M buckets holding N items in total (each bucket can be implemented as an ArrayList, resizing array, linked list, or BST).
Load factor = N/M; keep the load factor low so that buckets stay short and operations stay fast.
Dynamic growing: when the load factor gets too high, create a new HashTable with 2M buckets. Iterate through all the items in the old HashTable, and add them into the new HashTable one by one.
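
The add, contains and resize steps can be sketched as a small set with separate chaining. The class name ChainedHashSet, the initial 4 buckets and the maximum load factor of 1.5 are assumptions chosen for the example, and null items are not supported; this is not the code from the linked MyHashMap.

```java
import java.util.ArrayList;
import java.util.List;

/** Hash set sketch: an array of buckets, each bucket a list of items. */
class ChainedHashSet<T> {
    private static final double MAX_LOAD = 1.5;   // assumed threshold
    private List<List<T>> buckets = newBuckets(4);
    private int size;                             // N, the number of items

    private static <E> List<List<E>> newBuckets(int m) {
        List<List<E>> b = new ArrayList<>(m);
        for (int i = 0; i < m; i++) {
            b.add(null);                          // empty buckets start as null
        }
        return b;
    }

    /** floorMod keeps the index non-negative even for negative hash codes. */
    private int indexOf(T item, int m) {
        return Math.floorMod(item.hashCode(), m);
    }

    public boolean contains(T item) {
        List<T> bucket = buckets.get(indexOf(item, buckets.size()));
        return bucket != null && bucket.contains(item);
    }

    public void add(T item) {
        if (contains(item)) {
            return;                               // sets hold no duplicates
        }
        if ((double) (size + 1) / buckets.size() > MAX_LOAD) {
            resize(2 * buckets.size());
        }
        insert(buckets, item);
        size++;
    }

    private void insert(List<List<T>> target, T item) {
        int i = indexOf(item, target.size());
        if (target.get(i) == null) {
            target.set(i, new ArrayList<>());
        }
        target.get(i).add(item);
    }

    /** Rehash every item into a table with m buckets. */
    private void resize(int m) {
        List<List<T>> bigger = newBuckets(m);
        for (List<T> bucket : buckets) {
            if (bucket == null) continue;
            for (T item : bucket) {
                insert(bigger, item);
            }
        }
        buckets = bigger;
    }

    public int size() { return size; }

    public int bucketCount() { return buckets.size(); }
}
```

Items must be rehashed on resize because the bucket index depends on M.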

Priority Queue

An abstract data type that optimizes for handling the minimum or maximum element.
Many ways to implement: heap, LLRB, ordered array, hash table. A heap is the usual choice.

Heap

Min-heap: Every node is less than or equal to both of its children
Complete: Missing items only at the bottom level (if any), all nodes are as far left as possible.

The three methods we care about for the PriorityQueue ADT:

add: Add the item to the end of the heap temporarily, then swim it up the hierarchy to its proper place.
getSmallest: Return the root of the heap (guaranteed to be the minimum by the min-heap property).
removeSmallest: Move the last item in the heap into the root, then sink it down the hierarchy to its proper place.
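
The three methods can be sketched with an array-backed binary min-heap. The class name IntMinHeap and the use of int items are assumptions for this example; with 0-based indexing, the parent of index k is (k - 1) / 2 and its children are 2k + 1 and 2k + 2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

/** Array-backed min-heap sketch; the list order is the level order of a complete tree. */
class IntMinHeap {
    private final List<Integer> items = new ArrayList<>();

    public void add(int x) {
        items.add(x);                 // place at the end, keeping the tree complete
        swim(items.size() - 1);
    }

    public int getSmallest() {
        if (items.isEmpty()) {
            throw new NoSuchElementException("heap is empty");
        }
        return items.get(0);
    }

    public int removeSmallest() {
        int smallest = getSmallest();
        int last = items.remove(items.size() - 1);   // remove by index
        if (!items.isEmpty()) {
            items.set(0, last);       // move the last item into the root
            sink(0);
        }
        return smallest;
    }

    public int size() { return items.size(); }

    /** Swap upward while the item is smaller than its parent. */
    private void swim(int k) {
        while (k > 0) {
            int parent = (k - 1) / 2;
            if (items.get(k) >= items.get(parent)) break;
            Collections.swap(items, k, parent);
            k = parent;
        }
    }

    /** Swap downward with the smaller child while the item is larger than it. */
    private void sink(int k) {
        while (true) {
            int child = 2 * k + 1;
            if (child >= items.size()) break;
            if (child + 1 < items.size() && items.get(child + 1) < items.get(child)) {
                child++;
            }
            if (items.get(k) <= items.get(child)) break;
            Collections.swap(items, k, child);
            k = child;
        }
    }
}
```

Both swim and sink follow a single root-to-leaf path, so add and removeSmallest take O(log N) time.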

Tries

https://github.com/Berkeley-CS61B-Student/sp19-s1365/blob/master/lab9/MyTrieSet.java
A specific implementation of Sets and Maps that is specialized for strings.
Useful in cases where keys can be broken into "characters" and share prefixes with other keys.

Every node stores only one letter.
Nodes can be shared by multiple keys.
The node holding the last character of each key is marked (drawn in blue in the diagrams).
A lookup fails in two cases: the final node is unmarked (white), or we fall off the tree.
There are several ways to store the children of each trie node, three in particular:
DataIndexedCharMap (Con: excessive use of space, Pro: speed efficient)
Bushy BST (Con: slower child search, Pro: space efficient)
Hash Table (Con: higher cost per link, Pro: space efficient)
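
A minimal sketch of a trie set using the hash table option for children follows. The class name SimpleTrieSet and the field name isKey (the "blue" marker) are assumptions for this example, not names from the linked MyTrieSet.

```java
import java.util.HashMap;
import java.util.Map;

/** Trie set sketch: each link is labeled with one character. */
class SimpleTrieSet {
    private static class Node {
        boolean isKey;                                  // true if a key ends here
        Map<Character, Node> next = new HashMap<>();    // children, by character
    }

    private final Node root = new Node();

    public void add(String key) {
        Node cur = root;
        for (char c : key.toCharArray()) {
            // Reuse the existing child if another key shares this prefix.
            cur = cur.next.computeIfAbsent(c, k -> new Node());
        }
        cur.isKey = true;
    }

    /** Fails if we fall off the tree or the final node is not marked as a key. */
    public boolean contains(String key) {
        Node end = walk(key);
        return end != null && end.isKey;
    }

    /** True if some added key starts with the given non-empty prefix. */
    public boolean hasPrefix(String prefix) {
        return !prefix.isEmpty() && walk(prefix) != null;
    }

    /** Follows s character by character; returns null if we fall off the tree. */
    private Node walk(String s) {
        Node cur = root;
        for (char c : s.toCharArray()) {
            cur = cur.next.get(c);
            if (cur == null) {
                return null;
            }
        }
        return cur;
    }
}
```

After adding "sam" and "sad", the nodes for "s" and "a" are shared, and contains("sa") is false because that node is not marked as a key.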

Quad Trees & K-D Trees

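A K-D tree is a BST that compares on a different dimension at each level, cycling through the dimensions.
2-D case: compare x, then y, then x, then y, and so on.
3-D case: it rotates through the three dimensions, returning to the first one every three levels,
and so on for even higher dimensions.
Nearest-neighbor queries can be answered with a K-D tree by pruning subtrees that cannot contain a closer point.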
https://github.com/Berkeley-CS61B-Student/sp19-s1365/blob/master/proj2ab/bearmaps/KDTree.java
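
A minimal 2-D sketch of insertion and nearest-neighbor search with pruning is shown below. The class name KDTree2D and its method signatures are assumptions made for this example and do not mirror the linked KDTree.java.

```java
/** 2-D K-D tree sketch: levels alternate between comparing x and comparing y. */
class KDTree2D {
    private static class Node {
        final double x, y;
        final boolean splitOnX;   // which coordinate this level compares
        Node left, right;
        Node(double x, double y, boolean splitOnX) {
            this.x = x;
            this.y = y;
            this.splitOnX = splitOnX;
        }
    }

    private Node root;

    public void insert(double x, double y) {
        root = insert(root, x, y, true);   // the root compares x
    }

    private Node insert(Node n, double x, double y, boolean splitOnX) {
        if (n == null) return new Node(x, y, splitOnX);
        if (n.x == x && n.y == y) return n;                  // duplicate point
        double diff = splitOnX ? x - n.x : y - n.y;
        if (diff < 0) n.left = insert(n.left, x, y, !splitOnX);
        else n.right = insert(n.right, x, y, !splitOnX);     // ties go right
        return n;
    }

    /** Returns {x, y} of the closest stored point, or null if the tree is empty. */
    public double[] nearest(double x, double y) {
        Node best = nearest(root, x, y, null);
        return best == null ? null : new double[] {best.x, best.y};
    }

    private static double distSq(Node n, double x, double y) {
        double dx = n.x - x;
        double dy = n.y - y;
        return dx * dx + dy * dy;
    }

    private Node nearest(Node n, double x, double y, Node best) {
        if (n == null) return best;
        if (best == null || distSq(n, x, y) < distSq(best, x, y)) best = n;
        double diff = n.splitOnX ? x - n.x : y - n.y;
        Node good = diff < 0 ? n.left : n.right;   // side containing the query
        Node bad = diff < 0 ? n.right : n.left;
        best = nearest(good, x, y, best);
        // Visit the other side only if the splitting line is closer than the best so far.
        if (diff * diff < distSq(best, x, y)) {
            best = nearest(bad, x, y, best);
        }
        return best;
    }
}
```

The pruning test compares the squared distance to the splitting line against the best squared distance found so far, so whole subtrees can be skipped without missing a closer point.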