Bloom Filter
Is a probabilistic datastructure that used probability to determine if a particular item in the set.
Under the hood, it's just a bit array with 0
and 1
with a fixed length, for example in here is 18.
We have 2 operation: Insert and Look up
Also built-in supports for Redis, Postgres, …
Use cases
- Checking for existing in database
- Deduplication
Insert
We perform insert with constant space and time complexity:
- Hash the input value
- Mod the result by the length of the array
- Set corresponding bit to 1
Lookup
We perform lookup at a constant space and time as well
- Hash the input value
- Mod the result by the length of the array
- Check if the corresponding bit is 0 or 1
If it's 0
that means it's not there.
In this example, the bit at the 10_th_ index is 0, which indicates that the given string isn’t present in the Bloom filter.