patternsqlMinor
Get count estimates from pg_class.reltuples for given conditions
Viewed 0 times
estimatesreltuplesgetforconditionscountfromgivenpg_class
Problem
Is it possible to query the
Currently on my bigger tables the
reltuples column for a given table with additional conditions like table.name LIKE 'hello%'?Currently on my bigger tables the
SELECT count(*) query takes long and I wouldn't need an exact count. So I wanted to know if it's possible to add WHERE clauses to the reltuples as well?Solution
Not out of the box. But you can achieve it with a ...
Partial index
The actual index column (
Every index has its own entry in
Number of rows in the table. This is only an
estimate used by the planner. It is updated by
few DDL commands such as
Hence, a partial index can be (ab-)used to get count estimates for any set of predicates, updated by
This might actually be a very clever idea. But you have to weigh cost and benefits: the partial index is small for a rare condition, but the cost gets bigger for a common condition. And while keeping the statistics up to date is comparatively cheap, it's not free of cost.
Related:
You will love the new feature in the upcoming Postgres 9.5, which only looks at a random sample of n % of blocks in the table to get a quick estimate. Example for 1 %:
Details in the answer already linked above:
Alternative for given example
For the given example
See:
Partial index
CREATE INDEX tbl_name_hello_idx ON tbl(tbl_id) WHERE name LIKE 'hello%';
SELECT reltuples
FROM pg_class
WHERE oid = 'tbl_name_hello_idx'::regclass; -- or schema-qualify table nameThe actual index column (
tbl_id in the example) is irrelevant (unless you have additional use for the index). Best pick a small column that is never changed: a serial PK column would be a perfect candidate. Or you could use a constant (which defeats additional purposes of that index):CREATE INDEX tbl_name_hello_idx ON tbl((1)) WHERE name LIKE 'hello%';Every index has its own entry in
pg_class and its own reltuples count. The documentation:reltuples ...Number of rows in the table. This is only an
estimate used by the planner. It is updated by
VACUUM, ANALYZE, and afew DDL commands such as
CREATE INDEX.Hence, a partial index can be (ab-)used to get count estimates for any set of predicates, updated by
autovacuum automatically. Or maybe you already have the index because you need it anyway?This might actually be a very clever idea. But you have to weigh cost and benefits: the partial index is small for a rare condition, but the cost gets bigger for a common condition. And while keeping the statistics up to date is comparatively cheap, it's not free of cost.
Related:
- Index that is not used, yet influences query
- Fast way to discover the row count of a table
TABLESAMPLE SYSTEM (n) in Postgres 9.5+You will love the new feature in the upcoming Postgres 9.5, which only looks at a random sample of n % of blocks in the table to get a quick estimate. Example for 1 %:
SELECT 100 * count(*) AS estimate
FROM tbl TABLESAMPLE SYSTEM (1)
WHERE name LIKE 'hello%';Details in the answer already linked above:
- Fast way to discover the row count of a table
Alternative for given example
For the given example
name LIKE 'hello%' you could get very fast exact results using the right index anyway:CREATE INDEX tbl_name_text_pattern_idx ON tbl(name text_pattern_ops);See:
- How can one optimize a query involving a sort and a LIKE?
- Pattern matching with LIKE, SIMILAR TO or regular expressions
Code Snippets
CREATE INDEX tbl_name_hello_idx ON tbl(tbl_id) WHERE name LIKE 'hello%';
SELECT reltuples
FROM pg_class
WHERE oid = 'tbl_name_hello_idx'::regclass; -- or schema-qualify table nameCREATE INDEX tbl_name_hello_idx ON tbl((1)) WHERE name LIKE 'hello%';SELECT 100 * count(*) AS estimate
FROM tbl TABLESAMPLE SYSTEM (1)
WHERE name LIKE 'hello%';CREATE INDEX tbl_name_text_pattern_idx ON tbl(name text_pattern_ops);Context
StackExchange Database Administrators Q#115873, answer score: 9
Revisions (0)
No revisions yet.