HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

get row count of last select query

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
lastqueryselectgetcountrow

Problem

I am using a scheduled script which exports the (new) rows of the PostgreSQL database into a textfile. There are also several instances of this script running for several DBs:

COPY (SELECT ... FROM ... GROUP BY zzz) TO STDOUT >file.csv


In order to have a consistency check of the file.csv, I then run a query to get the number of lines which the COPY query had:

SELECT count(0) FROM ... GROUP BY zzz >linecount


In the bash script there is a check if both are equal, if that's the case the script proceeds, if not it stops and reports an error (which happens rarely, but it does - and i haven't figured out why yet)

The problem, the query result is rather large (takes around 5-10min) and it is basically running twice (although the count(0) takes shorter, but still 3-7min)

Is there a PostgreSQL function to get the number of rows the last query had? If that would be the case, I could cut down DB load quite a load and shorten the export time.

I have looked into ROW_COUNT, but it only seems to be valid for UPDATE/INSERT and not SELECT, I'd also like to avoid using functions for simplicity reasons.

I am open to alternative suggestions, eg first query into a temp table, then export from there (maybe count(0) is faster then)?

Debian 7 Wheezy, SSD, 32GB Ram, PostgreSQL 9.1

UPDATE @Craig Ringer

I agree that the consistency check is somewhat 'flawed' by design. I noticed once that the import-check had a problem i couldnt explain, I then ran the export again, and the file had different lines than the one before:

Rowcount DB/LinecountF:532395/532014

rowcount: SELECT count(0)... as described originally
linecount  = wc -l


They are running as cronjobs and I havent figured out why those lines were missing. Running the same script second time and everything was fine. It happens very rarely, but the logfile didnt say anything (or mail result from cron).

I didnt check the exit status of the PSQL tough (as I built in my own checks). I will do t

Solution

If you're willing to forgo writing to STDOUT, you can get the number of rows that COPY exports with a PL/pgSQL function along the lines of:

CREATE OR REPLACE FUNCTION copyout(query text, output_path text) RETURNS integer AS $fn$
DECLARE
    result integer;
BEGIN
    EXECUTE 'COPY (' || query || $) TO '$ || output_path || $'$;
    GET DIAGNOSTICS result = ROW_COUNT;
    RETURN result;
END
$fn$ LANGUAGE plpgsql;


Execute with something like this (note the path is on the server not the client):

> SELECT copyout('SELECT * FROM mytable', '/tmp/myfile.csv');
 copyout
---------
     981
(1 row)


However - just found out this was added in 9.3 (see 9.3 Release Notes). Sorry.

Code Snippets

CREATE OR REPLACE FUNCTION copyout(query text, output_path text) RETURNS integer AS $fn$
DECLARE
    result integer;
BEGIN
    EXECUTE 'COPY (' || query || $$) TO '$$ || output_path || $$'$$;
    GET DIAGNOSTICS result = ROW_COUNT;
    RETURN result;
END
$fn$ LANGUAGE plpgsql;
> SELECT copyout('SELECT * FROM mytable', '/tmp/myfile.csv');
 copyout
---------
     981
(1 row)

Context

StackExchange Database Administrators Q#76365, answer score: 7

Revisions (0)

No revisions yet.