Database Indexing

In a 35 million rows. For example

postgres@localhost:pitchfork> select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 1.001s (1 second), executed in: 0.949s (0.949 second)

it will be 1 second (postgres is fast).

When creating index based on reviewid (reviewid)

After creating an index

create index reviewid_index on artists_big (reviewid);
postgres@localhost:pitchfork> create index reviewid_index on artists_big (reviewid);
CREATE INDEX
Time: 18.142s (18 seconds), executed in: 18.142s (18 seconds)
postgres@localhost:pitchfork> select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 0.060s

We see a big improvement. However if we do query with another field (artist) it will take longer

postgres@localhost:pitchfork> select * from artists_big where reviewid > 12345 and artist = 'Radiohead';
+----------+--------+
| reviewid | artist |
|----------+--------|
+----------+--------+
SELECT 0
Time: 1.205s (1 second), executed in: 1.200s (1 second)

As a result, we can index by both reviewid and artist

postgres@localhost:pitchfork> create index reviewid_artist_index on artists_big (reviewid, artist);
CREATE INDEX
Time: 31.841s (31 seconds), executed in: 31.841s (31 seconds)


postgres@localhost:pitchfork> select * from artists_big where reviewid > 12345 and artist = 'Radiohead';
+----------+--------+
| reviewid | artist |
|----------+--------|
+----------+--------+
SELECT 0
Time: 0.083s

as a result it's fast again. We can also drop the reviewid index only

drop index reviewid_index;

And it's still as fast for single query

select * from artists_big where reviewid = 12345;
The result was limited to 1000 rows
Time: 0.054s