Database Index (Secondary Key)
Database indexes can help to fasten up the database query.
Anology
Consider a book of 1000 pages with 10 chapters, each chapter 100 pages.
If you wanna find the word "Alchemist" — without index page, you would have to scan through the entire book/chapters i.e 1000 pages
With an index page you know where to go. To look up for a chapter you look at the index page and jump to that page.
However, on top of 1000 pages, you need around 10 pages to show the index, similar to how database index works.
Technical
When assigning more index to our database, we essentially are creating another sorted database just contain the key. Note that our database is essentially just disk storage.
Considering as below:
Field name Data type Size on disk
id (Primary key) Unsigned INT 4 bytes
firstName Char(50) 50 bytes
lastName Char(50) 50 bytes
emailAddress Char(100) 100 bytes
Supposed we have these following fields and our query are mainly findById()
and findByFirstName()
.
For findById
it's easy because it's incremental, we can simply perform Binary Search with $O(logn)$ time complexity.
However for findByFirstName
we need to traverse all the blocks in order to find the record that we need. Imagine we have 5,000,000
records that stored in 1,000,000
disk blocks (1 disk block contains 5 records)
Assigning an index
Now if we assign an index to firstName
, we essentially creating another table with the follow values
Field name Data type Size on disk
firstName Char(50) 50 bytes
(record pointer) Special 4 bytes
The record pointer
points to the row of the original table.
This table will have firstName
sorted. As a result, we can perform Binary Search again and reduce to $O(logn)$.
However, the database will need to create 1 more table and it takes up more disk space. As a result too many indices is not good.
[!note]
Creating an index will affect write because we need to build the B+ tree.You can create multiple indices per database