Twitter Snowflake ID Generator
Twitter Snowflake uses its own algorithm to generate 64-bit IDs, half the size of a 128-bit universally unique identifier (UUID).
It works by splitting the 64 bits into the following fields:
- Sign bit: 1 bit. Always `0`. Reserved for future use; it could be used to distinguish between signed and unsigned numbers.
- Timestamp: 41 bits. Milliseconds since a custom epoch. Twitter's default Snowflake epoch is `1288834974657` (Nov 04, 2010, 01:42:54 UTC).
- Datacenter ID: 5 bits, giving us $2^5 = 32$ datacenters.
- Machine ID: 5 bits, giving us $2^5 = 32$ machines per datacenter.
- Sequence number: 12 bits. For every ID generated on that machine/process, the sequence number increases by 1.
  - A machine can therefore generate a maximum of $2^{12} = 4096$ IDs per millisecond.
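The bit layout above can be sketched in Python as follows. This is a minimal illustration, not Twitter's actual implementation: the names `compose_id`, `SnowflakeGenerator`, and the injectable `clock` parameter are hypothetical, and two behaviours are assumptions the notes do not state: the sequence resets to 0 on each new millisecond, and the generator waits for the next millisecond when the sequence is exhausted and raises an error if the clock moves backwards.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657  # default Snowflake epoch from the notes

TIMESTAMP_BITS = 41
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12

MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1                # 4095
MACHINE_SHIFT = SEQUENCE_BITS                          # 12
DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS        # 17
TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS   # 22


def compose_id(timestamp_ms, datacenter_id, machine_id, sequence):
    """Pack the four fields into one integer; the sign bit stays 0."""
    elapsed = timestamp_ms - TWITTER_EPOCH_MS
    if not 0 <= elapsed < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
        raise ValueError("datacenter_id must be in 0..31")
    if not 0 <= machine_id < (1 << MACHINE_BITS):
        raise ValueError("machine_id must be in 0..31")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence must be in 0..4095")
    return (
        (elapsed << TIMESTAMP_SHIFT)
        | (datacenter_id << DATACENTER_SHIFT)
        | (machine_id << MACHINE_SHIFT)
        | sequence
    )


class SnowflakeGenerator:
    """Hypothetical generator; one instance per machine/process."""

    def __init__(self, datacenter_id, machine_id, clock=None):
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        # clock returns the current Unix time in milliseconds
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Assumption: refuse to generate rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # All 4096 values used in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                # Assumption: the sequence restarts on each new millisecond.
                self._sequence = 0
            self._last_ms = now
            return compose_id(now, self.datacenter_id, self.machine_id, self._sequence)
```

Calling `SnowflakeGenerator(datacenter_id=3, machine_id=7).next_id()` returns a positive integer that fits in 63 bits. Two calls within the same millisecond differ only in the low 12 sequence bits.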
Pros:
- Smaller: only 64 bits, as opposed to 128 bits for a universally unique identifier (UUID).
- IDs are numeric only.
- IDs include a timestamp, so they are roughly sortable by creation time.
- Scales easily across multiple servers; the chance of IDs clashing is very small.
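Because the timestamp sits in the high bits, the fields can be recovered from an ID with shifts and masks. The sketch below assumes the 41/5/5/12 layout and the default epoch described earlier; `decode_snowflake` and `created_at` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657  # default Snowflake epoch from the notes
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_snowflake(snowflake_id):
    """Split a Snowflake ID back into its four fields."""
    return {
        "timestamp_ms": (snowflake_id >> 22) + TWITTER_EPOCH_MS,  # top 41 bits
        "datacenter_id": (snowflake_id >> 17) & 0x1F,             # 5 bits
        "machine_id": (snowflake_id >> 12) & 0x1F,                # 5 bits
        "sequence": snowflake_id & 0xFFF,                         # low 12 bits
    }


def created_at(snowflake_id):
    """Return the creation time embedded in the ID as a UTC datetime."""
    ms = decode_snowflake(snowflake_id)["timestamp_ms"]
    return UNIX_EPOCH + timedelta(milliseconds=ms)
```

An ID generated in a later millisecond always compares greater than an earlier one, so sorting by ID approximates sorting by creation time.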
Cons:
- Only lasts about 69 years: $2^{41} - 1 = 2199023255551$ milliseconds $\approx 69$ years.
  - After that, we will need to adopt another technique or migrate to a new scheme.
- Servers might not run on the same clock, especially in a distributed, multi-machine environment, so we might need clock synchronisation.
  - We can use Network Time Protocol (NTP) as a solution.
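The 69-year figure can be checked with a few lines of arithmetic. The sketch below assumes an average year of 365.25 days; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657   # default Snowflake epoch from the notes
MAX_ELAPSED_MS = 2**41 - 1         # largest value the 41-bit timestamp can hold

# Assumption: an average year of 365.25 days.
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000
lifetime_years = MAX_ELAPSED_MS / MS_PER_YEAR

# Moment at which the 41-bit timestamp field runs out.
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
epoch_start = unix_epoch + timedelta(milliseconds=TWITTER_EPOCH_MS)
exhausted_at = epoch_start + timedelta(milliseconds=MAX_ELAPSED_MS)

print(f"lifetime: {lifetime_years:.1f} years")
print(f"timestamp field exhausted at: {exhausted_at.isoformat()}")
```

With the default epoch, the timestamp field runs out in 2080. Choosing a later custom epoch pushes that date back by the same amount.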