HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Performance penalty for COUNT(1) OVER (PARTITION BY NULL)

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
partitionpenaltynullforperformancecountover

Problem

In my application server, I would like to paginate a dataset using LIMIT and OFFSET, and additionally return the total count of the dataset to the user.

Instead of making two remote calls to the database:

select count(1) as total_count from foo;
select c1 from foo;


I thought it would be smarter to do it in a single database call:

select c1, count(1) over (partition by null) from foo;


However, adding this window function results in an execution time that is an order of magnitude longer compared to not using the window function.

I find it surprising because the analogous select count(1) from foo takes only twice the amount of time as select c1 from foo. Yet converting this to a window function results in a degradation.

Moreover, using the following alternative using a subquery is very fast:

select c1, (select count(1) from foo) as total_count from foo;


I would have expected that postgresql would be able to optimize on the partition by null

I tried this in Oracle and found a similar performance penalty.

What explains why there is a performance penalty here? Would it be relatively easy, or even worthwhile, for the core postgresql devs to make a change to optimize this, e.g. by converting window functions that are PARTITION BY NULL into a subquery?

Setup:

drop table foo;
create table foo (c1 int);

insert into foo
select i from generate_series(0, 100000) i;

analyze foo;


Regular SELECT:

=> explain analyze select c1 from foo;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..1443.01 rows=100001 width=4) (actual time=0.021..6.848 rows=100001 loops=1)
 Planning Time: 0.045 ms
 Execution Time: 10.021 ms


Not using the window function results in execution times around 10ms.

With a COUNT(1) OVER (PARTITION BY NULL) window function:

```
=> explain analyze select c1

Solution

Pagination?

First off, OFFSET / LIMIT are primitive tools for pagination, don't scale well, and don't work well under concurrent write load. Depending on your use case, there are much smarter solutions. See:

  • What is the recommended way to join junction tables for efficient ordering/pagination?



  • Optimize query with OFFSET on large table



Performance of total count

OVER (PARTITION BY NULL) is a pointless waste. Use the equivalent, faster OVER () instead.

count(1) is another (minor) pointless waste. Use count(*) instead.

Still, the observation is solid, I can reproduce it as expected.

However, it's still misleading. The test table is unrealistic, with just a single integer column. And typically you would add WHERE clauses and/or involve indexes. So your original test validity is limited.

A more realistic test case with (at least) some rudimentary payload columns and an index shows different insights:

CREATE TABLE foo (
  id int PRIMARY KEY
, payload text                    -- mininal payload of 32 byte text ...
, dt timestamptz DEFAULT now());  -- ... and a timestamp column

INSERT INTO foo (id, payload)
SELECT i, md5(i::text)::text
FROM   generate_series(1, 100000) i;


Consider the various tests in the fiddle (using Postgres 14; Postgres 13 is very similar, you can just switch the engine and re-run):

db<>fiddle here

count(*) OVER () is ~ 10 % faster than count(1) OVER (PARTITION BY NULL). Pay special attention to the more reliable test with warm cache ("repeat main test x with warm cache".

EXPLAIN even expects it to be faster estimating 3185 instead of 3435.

The subquery is still ~ 2-3x faster for the simple case without WHERE.

But adding a selective WHERE clause is a game changer. And that's the much more common use case.

And if the query does not have (perfect) index support, the picture changes again. Now, the subquery is substantially slower. Two sequential scans outweigh the overhead added by the window function.

Real-life examples are typically more messy, with table / index bloat, more columns, joins, computations, etc. Then a separate scan for the count in a subquery typically gets more expensive, yet.

Related:

  • Best way to get result count before LIMIT was applied

Code Snippets

CREATE TABLE foo (
  id int PRIMARY KEY
, payload text                    -- mininal payload of 32 byte text ...
, dt timestamptz DEFAULT now());  -- ... and a timestamp column

INSERT INTO foo (id, payload)
SELECT i, md5(i::text)::text
FROM   generate_series(1, 100000) i;

Context

StackExchange Database Administrators Q#308350, answer score: 5

Revisions (0)

No revisions yet.