Postgres Query Optimization, high cost functions
Problem
Currently I'm working with a PostgreSQL 12 table that looks like this:
create table if not exists asset (
id text,
symbol text not null,
name text not null,
primary key (id)
);
create table if not exists latest_value (
timestamp bigint,
asset text,
price decimal null,
market_cap decimal null,
primary key (asset),
foreign key (asset)
references asset (id)
on delete cascade
);
create table if not exists value_aggregation (
context aggregation_context,
timestamp bigint,
asset text,
price jsonb null,
market_cap jsonb null,
primary key (context, timestamp, asset),
foreign key (asset)
references asset (id)
on delete cascade
) partition by list (context);
create table if not exists value_aggregation_hour
partition of value_aggregation
for values in ('hour');
create index if not exists value_aggregation_timestamp_index
on value_aggregation using brin(timestamp)
with (autosummarize=true);
The table value_aggregation_hour has approximately 2 million rows. The price column consists of a jsonb with attributes like open, close, avg.

Now the problem:
The following query takes way too long.
WITH base_table AS
(SELECT asset, timestamp, market_cap, price
FROM latest_value
ORDER BY market_cap DESC
LIMIT 50
OFFSET 0)
SELECT asset.name, asset.symbol, asset.id, asset.market_data, asset.meta_data, timestamp, market_cap, price, spark.sparkline
FROM base_table LEFT JOIN (
SELECT asset, array_agg(CAST(price->>'open' AS decimal) ORDER BY timestamp ASC) AS sparkline
FROM value_aggregation
WHERE context = 'hour'
AND timestamp > extract(epoch from (now() - INTERVAL '7d'))
AND asset IN (
SELECT asset
FROM base_table)
GROUP BY asset
) spark ON base_table.asset = spark.asset
INNER JOIN asset ON base_table.asset = asset.id;
The resulting query plan looks like this:
Merge Left Join (cost=234610.64..234774.05 rows=494 width=1740) (actual time=9173.660..9176.986 rows=50 loops=1)
Merge Cond:
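One way to see the cost comparison the planner is making here is to discourage sequential scans for a single session and re-run EXPLAIN, so the index-scan alternative and its estimated cost become visible (a diagnostic sketch only; enable_seqscan is a troubleshooting toggle, not a production setting):

```sql
-- Diagnostic only: make sequential scans look prohibitively expensive
-- for this session, so the planner reports the index-scan plan instead.
SET enable_seqscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT asset,
       array_agg(CAST(price->>'open' AS decimal) ORDER BY timestamp) AS sparkline
FROM value_aggregation
WHERE context = 'hour'
  AND timestamp > extract(epoch from (now() - INTERVAL '7d'))
GROUP BY asset;

-- Restore the default planner behavior.
RESET enable_seqscan;
```

Comparing the two EXPLAIN outputs (with and without the toggle) shows both the estimated costs and the actual runtimes of the competing plans.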
Solution
PostgreSQL estimates the sequential scan on value_aggregation_hour to be slightly cheaper than the index scan (233000 vs. 236000), while in reality it is much cheaper. The row count estimate is very good, so the problem is probably that PostgreSQL has a wrong idea about your machine. You could try to improve that:

- set effective_cache_size to the amount of memory available for caching data (shared_buffers + file system cache). Higher values lower the estimated cost of index scans.
- set random_page_cost to a lower value. If random access is as fast as sequential access on your storage system, use a value of 1. Lower values lower the estimated cost of index scans.
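The two settings could be applied like this (a sketch assuming roughly 8 GB of memory available for caching and SSD storage; adjust both values to your machine):

```sql
-- Try the settings per-session first, to verify the plan actually changes:
SET effective_cache_size = '8GB';
SET random_page_cost = 1.1;  -- close to 1 is typical for SSDs

-- Once confirmed, persist them cluster-wide:
ALTER SYSTEM SET effective_cache_size = '8GB';
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();
```

Note that both parameters only influence the planner's cost model; they do not allocate memory or change I/O behavior.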
Context
StackExchange Database Administrators Q#274461, answer score: 6