Twitter Snowflake ID Generator

Twitter Snowflake is an ID generation algorithm that produces 64-bit IDs, half the size of a 128-bit Universally unique identifier (UUID).

The way it works is like this:

[Image: Snowflake ID bit layout]

We have the following bits:

  1. Sign bit: 1 bit. Always 0. Reserved for future use; it could be used to distinguish between signed and unsigned numbers.
  2. Timestamp: 41 bits. Milliseconds since a custom epoch. Twitter's default Snowflake epoch is 1288834974657 (Nov 04, 2010, 01:42:54 UTC).
  3. Datacenter ID: 5 bits, which gives us $2^5 = 32$ datacenters.
  4. Machine ID: 5 bits, which gives us $2^5 = 32$ machines per datacenter.
  5. Sequence number: 12 bits. For every ID generated on that machine/process, the sequence number increases by 1, and it resets to 0 every millisecond.
    • A machine can therefore generate a maximum of $2^{12} = 4096$ IDs per millisecond.
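
The bit layout above can be sketched as a small generator. This is a minimal illustration under stated assumptions, not Twitter's actual implementation: the names `SnowflakeGenerator`, `next_id` and `decode` are made up for this note, the optional `clock` parameter is a hypothetical hook that makes the sketch testable, and the choice to raise an error when the clock moves backwards is just one possible policy.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657  # Nov 04, 2010, 01:42:54 UTC

DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12

MAX_DATACENTER_ID = (1 << DATACENTER_BITS) - 1  # 31
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1        # 31
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1         # 4095

# How far each field is shifted left inside the 64-bit ID.
MACHINE_SHIFT = SEQUENCE_BITS                        # 12
DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS      # 17
TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS  # 22


class SnowflakeGenerator:
    """Hypothetical generator for one machine/process."""

    def __init__(self, datacenter_id, machine_id, clock=None):
        if not 0 <= datacenter_id <= MAX_DATACENTER_ID:
            raise ValueError("datacenter_id must fit in 5 bits")
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id must fit in 5 bits")
        self.datacenter_id = datacenter_id
        self.machine_id = machine_id
        # Assumption: the clock returns Unix time in milliseconds.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # One possible policy; waiting it out is another.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already used this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            # The sign bit stays 0 because the timestamp offset fits in 41 bits.
            return (
                ((now - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT)
                | (self.datacenter_id << DATACENTER_SHIFT)
                | (self.machine_id << MACHINE_SHIFT)
                | self._sequence
            )


def decode(snowflake_id):
    """Split an ID back into its fields."""
    return {
        "timestamp_ms": (snowflake_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS,
        "datacenter_id": (snowflake_id >> DATACENTER_SHIFT) & MAX_DATACENTER_ID,
        "machine_id": (snowflake_id >> MACHINE_SHIFT) & MAX_MACHINE_ID,
        "sequence": snowflake_id & MAX_SEQUENCE,
    }
```

Calling `SnowflakeGenerator(datacenter_id=3, machine_id=7).next_id()` returns one integer, and `decode` on that integer gives back the timestamp, datacenter ID, machine ID and sequence number. Because the timestamp sits in the highest bits, IDs from the same generator sort by creation time.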

Pros:

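  • Smaller: uses only 64 bits, as opposed to the 128 bits of a Universally unique identifier (UUID)
  • The ID is purely numerical
  • The ID includes a timestamp, so IDs can be sorted by creation time
  • Scales easily across multiple servers: each machine has its own datacenter ID and machine ID, so the chance of two servers generating the same ID is very small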

Cons:

  • Only lasts for about 69 years: $2^{41} - 1 = 2199023255551$ milliseconds ≈ 69 years
    • After 69 years we will need to adopt another technique or migrate to a new scheme
  • Servers might not all run on the same clock, especially in a distributed (multi-machine) environment, so we might need clock synchronisation
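
The 69-year limit listed in the cons can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for this note, and a year is approximated as 365 days.

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657  # Nov 04, 2010, 01:42:54 UTC
TIMESTAMP_BITS = 41

# Largest millisecond offset that fits in the 41-bit timestamp field.
max_offset_ms = (1 << TIMESTAMP_BITS) - 1

# Approximation: 365-day years, ignoring leap days.
ms_per_year = 1000 * 60 * 60 * 24 * 365
lifespan_years = max_offset_ms / ms_per_year

# Date at which the timestamp field runs out, counted from the Snowflake epoch.
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
snowflake_epoch = unix_epoch + timedelta(milliseconds=TWITTER_EPOCH_MS)
rollover = snowflake_epoch + timedelta(milliseconds=max_offset_ms)

print(f"max offset: {max_offset_ms} ms")
print(f"lifespan:   {lifespan_years:.2f} years")
print(f"rollover:   {rollover.isoformat()}")
```

With Twitter's default epoch, the timestamp field runs out in 2080, which is the point at which a migration would be needed.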