PostgreSQL query is very slow for my big table
Problem
My database version is PostgreSQL 9.5.

I have a table that contains incoming HTTP request information:
create table if not exists request_log
(
id bigserial not null constraint app_requests_pkey primary key,
request_date timestamp not null,
ip varchar(50),
start_time timestamp,
application_name varchar(200),
request_path text,
display_url text,
username varchar(50)
);

The id column is the primary key and is indexed. The table has no relations to other tables. There are 72,320,081 rows in this table, and when I run the count query

select count(id) from request_log;

it takes 3-5 minutes.
The explain(analyze, buffers, format text) output for this query is:

Aggregate  (cost=3447214.71..3447214.72 rows=1 width=0) (actual time=135575.947..135575.947 rows=1 loops=1)
  Buffers: shared hit=96 read=2551303
  ->  Seq Scan on request_log  (cost=0.00..3268051.57 rows=71665257 width=0) (actual time=2.517..129032.408 rows=72320081 loops=1)
        Buffers: shared hit=96 read=2551303
Planning time: 0.067 ms
Execution time: 135575.988 ms

This is very bad performance for me. I cannot get reports from this table in my web applications because of it.
My server hardware resources are:
- OS: Ubuntu Server 16, on VMware
- 4 core cpu
- Mem 6Gb
- HDD 120 Gb
I run the queries at night, when there are no users on the database, but they are still slow. How can I solve this problem?
Solution
Counting rows is slow, because all rows of the table have to be visited. Counting id is even slower, because PostgreSQL first has to check whether id is NULL or not (NULL values are not counted).

There are a few options to speed things up:

- Use a more recent version of PostgreSQL. Then you can get parallel query, which will make the execution even more expensive, but faster.

- Use the index on id and keep the table well vacuumed. Then you can get an index-only scan.

- Use an extra table with a counter that gets updated by a trigger on every data modifying statement on the large table.

Have a look at my blog post for an in-depth discussion.
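The last option can be sketched in SQL. This is a minimal illustration, not code from the original answer: the counter table and function names are invented for the example, and the trigger serializes concurrent writers on the single counter row, so it trades write throughput for a fast count.

-- A cheap estimate first: the planner's row estimate from pg_class.
-- It costs no table scan, but is only as fresh as the last VACUUM/ANALYZE.
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'request_log';

-- Trigger-maintained exact counter (names are illustrative):
CREATE TABLE request_log_count (n bigint NOT NULL);
INSERT INTO request_log_count SELECT count(*) FROM request_log;

CREATE FUNCTION request_log_count_trg() RETURNS trigger
LANGUAGE plpgsql AS
$$
BEGIN
   IF TG_OP = 'INSERT' THEN
      UPDATE request_log_count SET n = n + 1;
      RETURN NEW;
   ELSIF TG_OP = 'DELETE' THEN
      UPDATE request_log_count SET n = n - 1;
      RETURN OLD;
   END IF;
END;
$$;

CREATE TRIGGER count_trg
   AFTER INSERT OR DELETE ON request_log
   FOR EACH ROW EXECUTE PROCEDURE request_log_count_trg();

-- The count is now a single-row read:
SELECT n FROM request_log_count;

EXECUTE PROCEDURE is used here because the question is on 9.5; on PostgreSQL 11 and later, EXECUTE FUNCTION is the preferred spelling.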
Context
StackExchange Database Administrators Q#255339, answer score: 5
Revisions (0)
No revisions yet.