HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Get count estimates from pg_class.reltuples for given conditions

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
estimatesreltuplesgetforconditionscountfromgivenpg_class

Problem

Is it possible to query the reltuples column for a given table with additional conditions like table.name LIKE 'hello%'?

Currently on my bigger tables the SELECT count(*) query takes long and I wouldn't need an exact count. So I wanted to know if it's possible to add WHERE clauses to the reltuples as well?

Solution

Not out of the box. But you can achieve it with a ...
Partial index

CREATE INDEX tbl_name_hello_idx ON tbl(tbl_id) WHERE name LIKE 'hello%';

SELECT reltuples
FROM   pg_class
WHERE  oid = 'tbl_name_hello_idx'::regclass;  -- or schema-qualify table name


The actual index column (tbl_id in the example) is irrelevant (unless you have additional use for the index). Best pick a small column that is never changed: a serial PK column would be a perfect candidate. Or you could use a constant (which defeats additional purposes of that index):

CREATE INDEX tbl_name_hello_idx ON tbl((1)) WHERE name LIKE 'hello%';


Every index has its own entry in pg_class and its own reltuples count. The documentation:

reltuples ...

Number of rows in the table. This is only an
estimate used by the planner. It is updated by VACUUM, ANALYZE, and a
few DDL commands such as CREATE INDEX.

Hence, a partial index can be (ab-)used to get count estimates for any set of predicates, updated by autovacuum automatically. Or maybe you already have the index because you need it anyway?

This might actually be a very clever idea. But you have to weigh cost and benefits: the partial index is small for a rare condition, but the cost gets bigger for a common condition. And while keeping the statistics up to date is comparatively cheap, it's not free of cost.

Related:

  • Index that is not used, yet influences query



  • Fast way to discover the row count of a table



TABLESAMPLE SYSTEM (n) in Postgres 9.5+

You will love the new feature in the upcoming Postgres 9.5, which only looks at a random sample of n % of blocks in the table to get a quick estimate. Example for 1 %:

SELECT 100 * count(*) AS estimate
FROM   tbl TABLESAMPLE SYSTEM (1)
WHERE  name LIKE 'hello%';


Details in the answer already linked above:

  • Fast way to discover the row count of a table



Alternative for given example

For the given example name LIKE 'hello%' you could get very fast exact results using the right index anyway:

CREATE INDEX tbl_name_text_pattern_idx ON tbl(name text_pattern_ops);


See:

  • How can one optimize a query involving a sort and a LIKE?



  • Pattern matching with LIKE, SIMILAR TO or regular expressions

Code Snippets

CREATE INDEX tbl_name_hello_idx ON tbl(tbl_id) WHERE name LIKE 'hello%';

SELECT reltuples
FROM   pg_class
WHERE  oid = 'tbl_name_hello_idx'::regclass;  -- or schema-qualify table name
CREATE INDEX tbl_name_hello_idx ON tbl((1)) WHERE name LIKE 'hello%';
SELECT 100 * count(*) AS estimate
FROM   tbl TABLESAMPLE SYSTEM (1)
WHERE  name LIKE 'hello%';
CREATE INDEX tbl_name_text_pattern_idx ON tbl(name text_pattern_ops);

Context

StackExchange Database Administrators Q#115873, answer score: 9

Revisions (0)

No revisions yet.