Count-min Sketch (Cms)
Use to estimate the frequency stream of data with a fixed memory. For example if we want to count the frequency of this stream of data:
A A A B C
So it should return A: 3, B: 1
Algorithm
Count-min sketch can be constructed using a 2 dimensional array with width normally very big (1000) and height very small represent the number of hash function
To add the frequency
For example in this case, we have 5 hash functions and 7 columns:

When a new element comes, we calculate the hash value of each of the 5 hash function and add 1 to the correspondent cell (maybe with %)
For example A comes, we have the following

When another A comes, we have

Same for another A

Then B arrives, however we have a hash collision that will accidentally increase the frequency of A

then C arrives and we have another collision with B and A:

To retrieve the frequency
To receive A, we found the minimum of all the counters. In this case should be 3. That's why we need multiple hash functions.
Choosing width and height
$$width = \text{Math.ceil}(\frac{2.0}{e})$$
$$height = \text{Math.ceil}(-\frac{\text{Math.log}(1-d)} {\text{Math.log}(2)})$$
In which
- $e$: the accuracy we want to have
- $d$: the certainty which we reach the accuracy
Use cases
- Top K elements
- Keep a heap of top-k elements with the given Count-min sketch
- Calculate frequency with fixed size hash