patternsqlMinor
Optimize simple query using ORDER BY date and text
Viewed 0 times
simpleorderquerytextdateusingoptimizeand
Problem
I have a query to a table in Postgres with an order based on a date field and a number field, this table has 1000000 records
The data types of the table are:
The query is:
This query takes about 5 seconds, but if I take out the "order by" the query takes only 0.499 seconds
The problem I have is that I need to run this query in the shortest time possible,
so I search on google what can I do and create a composite index with the following query
But the query is taking the same time or even more.
I'm using Postgres 9.0.13, here is the EXPLAIN with 73436 rows
Postgres is running on a Phenon II 1055T (3 cores) With 8 GB Ram and 500 GB disk.
How I can optimize this query?
The data types of the table are:
fcv_id = serial
fcv_fecha_comprobante = timestamp without time zone
fcv_numero_comprobante = varchar(60)The query is:
SELECT fcv_id, fcv_fecha_comprobante FROM factura_venta
ORDER BY fcv_fecha_comprobante, fcv_numero_comprobanteThis query takes about 5 seconds, but if I take out the "order by" the query takes only 0.499 seconds
The problem I have is that I need to run this query in the shortest time possible,
so I search on google what can I do and create a composite index with the following query
CREATE INDEX factura_venta_orden ON factura_venta
USING btree (fcv_fecha_comprobante ASC NULLS LAST
, fcv_numero_comprobante ASC NULLS LAST);
ALTER TABLE factura_venta CLUSTER ON factura_venta_orden;But the query is taking the same time or even more.
I'm using Postgres 9.0.13, here is the EXPLAIN with 73436 rows
Sort (cost=11714.03..11897.62 rows=73436 width=27) (actual time=1260.759..1579.853 rows=73436 loops=1)
Sort Key: fcv_fecha_comprobante, fcv_numero_comprobante
Sort Method: external merge Disk: 2928kB
-> Seq Scan on factura_venta (cost=0.00..4018.36 rows=73436 width=27) (actual time=0.363..162.558 rows=73436 loops=1)
Total runtime: 1694.882 msPostgres is running on a Phenon II 1055T (3 cores) With 8 GB Ram and 500 GB disk.
How I can optimize this query?
Solution
According to your comment on a deleted post, you load all rows into a java module to conduct a search there. But searching is better done in the database itself - that's what a database is good at. Only return the rows you actually need.
If you really need all rows, there are many little things to make this faster. 1M rows will never be very fast, though.
Postgres 9.2 or later
You can make the index covering by appending
This way, provided the table isn't updated too much, Postgres can retrieve results with an index-only scan.
The additional column comes last since it does not contribute to the sort order. Explanation:
In Postgres 11 or later you could make that:
I see you already found
There is also the community tool
This line in your
tells us, that sorting is not done in RAM, which is expensive. You could probably improve performance by tuning the according setting for
Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. ...
Setting this too high may have adverse effects. Read the manual carefully. Consider increasing the setting only for the transaction with the big query:
50 MB are an estimate based on your
If you really need all rows, there are many little things to make this faster. 1M rows will never be very fast, though.
Postgres 9.2 or later
You can make the index covering by appending
fcv_id:CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante, fcv_id);This way, provided the table isn't updated too much, Postgres can retrieve results with an index-only scan.
The additional column comes last since it does not contribute to the sort order. Explanation:
- Is a composite index also good for queries on the first field?
In Postgres 11 or later you could make that:
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante) INCLUDE (fcv_id);CLUSTER / pg_repackI see you already found
CLUSTER. You are aware that this is a one-time operation, that should help your cause, but needs to be re-run after enough updates?There is also the community tool
pg_repack as replacement for VACUUM FULL / CLUSTER.work_memThis line in your
EXPLAIN output:Sort Method: external merge Disk: 2928kBtells us, that sorting is not done in RAM, which is expensive. You could probably improve performance by tuning the according setting for
work_memwork_mem (integer)Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. ...
Setting this too high may have adverse effects. Read the manual carefully. Consider increasing the setting only for the transaction with the big query:
BEGIN;
SET LOCAL work_mem = '50MB';
SELECT ...;
COMMIT;50 MB are an estimate based on your
EXPLAIN ANALYZE output for 73k rows. Test with 1M rows to get the actual amount you need.Code Snippets
CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante, fcv_id);CREATE INDEX factura_venta_orden
ON factura_venta (fcv_fecha_comprobante, fcv_numero_comprobante) INCLUDE (fcv_id);Sort Method: external merge Disk: 2928kBBEGIN;
SET LOCAL work_mem = '50MB';
SELECT ...;
COMMIT;Context
StackExchange Database Administrators Q#48630, answer score: 8
Revisions (0)
No revisions yet.