HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL does not use all available IOPS on a single operation

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlavailablealloperationiopssingledoesusenot

Problem

Using PostgreSQL 9.3 on Windows 7x64, CoreI7, SSD.

Assume a PostgreSQL table with 8 Int32 columns and following C# code to fill the table.

using (var connection = new NpgsqlConnection("User Id=postgres;Password=**;host=localhost;database=**;"))
using (var command = new NpgsqlCommand())
{
    connection.Open();
    command.Connection = connection;
    command.CommandText = "COPY \"MyTable\" FROM STDIN;";
    var copyInSerializer = new NpgsqlCopySerializer(connection);
    var copyIn = new NpgsqlCopyIn(command, connection, copyInSerializer.ToStream);
    copyIn.Start();

    for (int i = start; i < start + 1000000; i++)
    {
        copyInSerializer.AddInt32(i);
        //add 7 other int32 values
        copyInSerializer.EndRow();
    }
    copyInSerializer.Flush();
    copyIn.End();

    connection.Close();
}


On my test hardware this took ~10 seconds execution time and measured disk usage of ~20MB/s.

CPU and RAM do NOT hit limits.

The given SSD supports ~60MB/s writing random 4k blocks so I wondered why it would not execute faster. I created a second, identical table in the same database and disk and ran the above code sample in parallel (two tasks) on both tables.

Both took ~10 seconds, and Measured Disk usage was ~40MB/s.

Why does PostgreSQL not focus all available resources on the one and only operation happening?

Is there a way to force PostgreSQL to use the full available disk bandwidth?

Solution

ADO.NET (used in the code sample) obviously communicates via network - even if a local Postgres instance is used. Using STDIN therefore results in quite a bit overhead. Before Postgres is able to insert the bulk data, it needs to be:

  • serialized by C# client



  • transmitted via network connection (in my case to localhost)



  • deserialized by Postgres



This basically explains/justifies the described behaviour.

If the C# Client and Postgres are run on the same computer, you can speed up bulk inseration by writing a csv file on a local ssd drive (even better a ram drive) and execute the SQL statement 'Copy MyTable From x:\test.csv'.

Context

StackExchange Database Administrators Q#57506, answer score: 2

Revisions (0)

No revisions yet.