Database Indexing
In a 35 million rows. For example
postgres@localhost:pitchfork> select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 1.001s (1 second), executed in: 0.949s (0.949 second)
it will be 1 second (postgres is fast).
When creating index based on reviewid
(reviewid)
After creating an index
create index reviewid_index on artists_big (reviewid);
postgres@localhost:pitchfork> create index reviewid_index on artists_big (reviewid);
CREATE INDEX
Time: 18.142s (18 seconds), executed in: 18.142s (18 seconds)
postgres@localhost:pitchfork> select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 0.060s
We see a big improvement. However if we do query with another field (artist
) it will take longer
postgres@localhost:pitchfork> select * from artists_big where reviewid > 12345 and artist = 'Radiohead';
+----------+--------+
| reviewid | artist |
|----------+--------|
+----------+--------+
SELECT 0
Time: 1.205s (1 second), executed in: 1.200s (1 second)
As a result, we can index by both reviewid
and artist
postgres@localhost:pitchfork> create index reviewid_artist_index on artists_big (reviewid, artist);
CREATE INDEX
Time: 31.841s (31 seconds), executed in: 31.841s (31 seconds)
postgres@localhost:pitchfork> select * from artists_big where reviewid > 12345 and artist = 'Radiohead';
+----------+--------+
| reviewid | artist |
|----------+--------|
+----------+--------+
SELECT 0
Time: 0.083s
as a result it's fast again. We can also drop the reviewid
index only
drop index reviewid_index;
And it's still as fast for single query
select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 0.054s