PostgreSQL SSD performance benchmark

how much faster can it be

14. November 2018

The big difference between SSD and HDD is rotation.
SSD has no rotation, so there is no need to wait for the spindle to return from it's turn around.

To use this performance feature of SSD storage, PostgreSQL has some nice configuration parameters in it's postgresql.conf file.

I have tested these parameters and made some benchmarks with both tools

On regular disks PostgreSQL has one single process to create an index.
On SSD it makes sense to let this index be created by multiple workers.
This also makes sense on e.g. RAID level 10.

pg_restore

In my test scenario pg_restore has to restore a 1.3 GB database dump in a directory. So it can use it's support for multi threading. Uncompressed database size is about 30 GB including some very big indexes.

The test is run on a notebook computer with

First try is about 35 minutes

Without "effective_io_concurrency" pg_restore piped into psql takes about 35 minutes. Bottleneck pipe and bottleneck "spindle".
time pg_restore -j 4 nuclosdb_20181114 | psql -h localhost -U nuclos nuclosdb

SSD never came to it's I/O limit.
CPU was most of the time at 100 % for only 1 process.

Second try about 30 minutes

PostgreSQL has enabled it's SSD parameters, so only one bottleneck is left (the pipe):
time pg_restore

nuclosdb_20181114 | psql -h localhost -U nuclos nuclosdb

SSL only somtimes came to it's I/O limit - when parallel reading/writing occurred.
CPU was sometimes at 100 % for psql process.

Third try again about 30 minutes, PostgreSQL eats SSD

Using pg_restore's multi threaded direct import:
time pg_restore -j 4 -h localhost -d nuclosdb -U nuclos nuclosdb_20181114/

SSD was quite always on it's I/O limit.
No other bottleneck.

Fourth try with less threads for pg_restore, again about 30 minutes

The import could have caused a race condition with the database, because pg_restore runs on the same machine as PostgreSQL server.
So I've reduce the number of CPU cores from 4 to 2:
time pg_restore -j 2 -h localhost -d nuclosdb -U nuclos nuclosdb_20181114/

The configuration parameters

effective_io_concurrency = 128

default value is 1 - one hard disk, one spindle, one possible concurrent io operation

max_worker_processes = 128

max_parallel_workers_per_gather = 16

max_parallel_workers = 8

any SQL statement can use as many workers as given here.
should be limited to CPU count to reduce the risk of race condition.

autovacuum = off

during restore autovacuum starts analyzing new tables. this brakes import speed and can also be done later. so temporarily disable autovacuum for better performance.

Return

no analytics