patternsqlMinor
Index scan on UNIQUE index for count(*)
Viewed 0 times
indexuniquescanforcount
Problem
I've got a table
When I do
I would have expected a fast
If I do
Why is the unique index ignored in this case? What's the catch?
I am using PostgreSQL 10.4
For a clean room test I did the following:
```
create table tmp
(
row_id varchar(15) unique not null,
);
insert into tmp (row_id, ) select row_id, from t;
commit;
analyze tmp;
set enable_seqscan = on;
explain (analyze, buffers) select count(*) from tmp;
QUERY PLAN
Aggregate (cost=744070.45..744070.46 rows=1 width=8) (actual time=5631.501..5631.502 rows=1 loops=1)
Buffers: shared hit=209109 read=245254
-> Seq Scan on tmp (cost=0.00..686128.96 rows=23176596 width=0) (actual time=0.014..3481.967 rows=23176906 loops=1)
Buffers: shared hit=209109 read=245254
Planning time: 0.064 ms
Execution time: 5631.531 ms
SET enable_seqscan = off;
explain (analyze, buffers) select count(*) from tmp;
QUERY PLAN
Aggregate (cost=980282.14..980282.15 rows=1 width=8) (actual time=16224.408..16224.408 rows=1 loops=1)
Buffers: shared hit=26285 read=542015
-> Bitmap Heap Scan on tmp (cost=236211.69..922340.65 rows=23176596 width=0) (actual time=10030.115..14157.288 rows=23176906 loops=1)
Heap Blocks: exact=454363
Buffers: shared hit=26285 read=542015
-> Bitmap Index Scan on tmp_row_id_key (cost=0.00..230417.54 rows=23176596 width=0) (actual time=9929.5
t with ~23 million rows (4248 MB in size). There's a column row_id in it, with a not null constraint. And a unique index p1 on t(row_id). When I do
select count(*) from t to count all the rows in the table, the planner tells me: Seq Scan on t (cost=0.00..686191.06 rows=23176906 width=0)I would have expected a fast
Index Only Scan (index p1 occupies only 698 MB - 6x less). If I do
SET enable_seqscan = off, then the planner still insists on reading the table rows:QUERY PLAN
-> Bitmap Heap Scan on t (cost=210923.32..897114.38 rows=23176906 width=0)
-> Bitmap Index Scan on p1 (cost=0.00..205129.09 rows=23176906 width=0)Why is the unique index ignored in this case? What's the catch?
I am using PostgreSQL 10.4
For a clean room test I did the following:
```
create table tmp
(
row_id varchar(15) unique not null,
);
insert into tmp (row_id, ) select row_id, from t;
commit;
analyze tmp;
set enable_seqscan = on;
explain (analyze, buffers) select count(*) from tmp;
QUERY PLAN
Aggregate (cost=744070.45..744070.46 rows=1 width=8) (actual time=5631.501..5631.502 rows=1 loops=1)
Buffers: shared hit=209109 read=245254
-> Seq Scan on tmp (cost=0.00..686128.96 rows=23176596 width=0) (actual time=0.014..3481.967 rows=23176906 loops=1)
Buffers: shared hit=209109 read=245254
Planning time: 0.064 ms
Execution time: 5631.531 ms
SET enable_seqscan = off;
explain (analyze, buffers) select count(*) from tmp;
QUERY PLAN
Aggregate (cost=980282.14..980282.15 rows=1 width=8) (actual time=16224.408..16224.408 rows=1 loops=1)
Buffers: shared hit=26285 read=542015
-> Bitmap Heap Scan on tmp (cost=236211.69..922340.65 rows=23176596 width=0) (actual time=10030.115..14157.288 rows=23176906 loops=1)
Heap Blocks: exact=454363
Buffers: shared hit=26285 read=542015
-> Bitmap Index Scan on tmp_row_id_key (cost=0.00..230417.54 rows=23176596 width=0) (actual time=9929.5
Solution
As to why you do not get an index scan with
Sample Data
With
Notice we're doing a "Parallel Seq Scan"
Without
Notice we're doing a "Parallel Index Only Scan"
SET enable_seqscan = off, you should be getting an index-only scan. Your situation can not yet be recreated with the data you've provided. This certainly works with PostgreSQL 10.4. I can't speak to your own use case, and there are a lot of reasons why you may not get an index scan in the real world. Ultimately, debugging a question along those lines will result in an answer that is simply "planner estimates" but will require a lot more data about your environment, configuration, and the plans with and without SET enable_seqscan = off.Sample Data
BEGIN;
CREATE TABLE foo ( x int NOT NULL UNIQUE );
INSERT INTO foo (x) SELECT generate_series(1,1e6);
COMMIT;
ANALYZE foo; -- don't forget to analyzeWith
seq_scantest=# EXPLAIN SELECT count(*) FROM foo;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize Aggregate (cost=10633.55..10633.56 rows=1 width=8)
-> Gather (cost=10633.33..10633.54 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=9633.33..9633.34 rows=1 width=8)
-> Parallel Seq Scan on foo (cost=0.00..8591.67 rows=416667 width=0)
(5 rows)Notice we're doing a "Parallel Seq Scan"
Without
seq_scanSET enable_seq_scan = off;
test=# EXPLAIN SELECT count(*) FROM foo;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=26616.97..26616.98 rows=1 width=8)
-> Gather (cost=26616.76..26616.97 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=25616.76..25616.77 rows=1 width=8)
-> Parallel Index Only Scan using foo_x_key on foo (cost=0.42..24575.09 rows=416667 width=0)
(5 rows)Notice we're doing a "Parallel Index Only Scan"
Code Snippets
BEGIN;
CREATE TABLE foo ( x int NOT NULL UNIQUE );
INSERT INTO foo (x) SELECT generate_series(1,1e6);
COMMIT;
ANALYZE foo; -- don't forget to analyzetest=# EXPLAIN SELECT count(*) FROM foo;
QUERY PLAN
--------------------------------------------------------------------------------------
Finalize Aggregate (cost=10633.55..10633.56 rows=1 width=8)
-> Gather (cost=10633.33..10633.54 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=9633.33..9633.34 rows=1 width=8)
-> Parallel Seq Scan on foo (cost=0.00..8591.67 rows=416667 width=0)
(5 rows)SET enable_seq_scan = off;
test=# EXPLAIN SELECT count(*) FROM foo;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=26616.97..26616.98 rows=1 width=8)
-> Gather (cost=26616.76..26616.97 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=25616.76..25616.77 rows=1 width=8)
-> Parallel Index Only Scan using foo_x_key on foo (cost=0.42..24575.09 rows=416667 width=0)
(5 rows)Context
StackExchange Database Administrators Q#209331, answer score: 3
Revisions (0)
No revisions yet.