gotchasqlModerate

Why does a Set Returning Function (SRF) run slower in a FROM clause?

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

whyslowerfunctionsrfreturningdoesfromclausesetrun

Problem

This is a database-internals question. I'm using PostgreSQL 9.5, I'm wondering why Set Returning Functions (SRFs), also known as Table Valued Functions (TVFs) run slower when in a FROM clause, for instance when I execute these commands,

CREATE TABLE foo AS SELECT * FROM generate_series(1,1e7);
SELECT 10000000
Time: 5573.574 ms

It's always substantially slower than,

CREATE TABLE foo AS SELECT generate_series(1,1e7);
SELECT 10000000
Time: 4622.567 ms

Is there a general rule that can be made here, such that we should always be running Set-Returning Functions outside of a FROM clause?

Solution

Let's start by comparing execution plans:

tinker=> EXPLAIN ANALYZE SELECT * FROM generate_series(1,1e7);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=32) (actual time=2382.582..4291.136 rows=10000000 loops=1)
 Planning time: 0.022 ms
 Execution time: 5539.522 ms
(3 rows)

tinker=> EXPLAIN ANALYZE SELECT generate_series(1,1e7);
                                           QUERY PLAN                                            
-------------------------------------------------------------------------------------------------
 Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.008..2622.365 rows=10000000 loops=1)
 Planning time: 0.045 ms
 Execution time: 3858.661 ms
(3 rows)

Okay, so now we know that SELECT * FROM generate_series() is executed using a Function Scan node, while SELECT generate_series() is executed using a Result node. Whatever is causing these queries to perform differently boils down to the difference between these two nodes, and we know exactly where to look.

One other interesting thing in the EXPLAIN ANALYZE output: note the timings. SELECT generate_series() is actual time=0.008..2622.365, while SELECT * FROM generate_series() is actual time=2382.582..4291.136. The Function Scan node starts returning records around the time the Result node finished returning records.

What was PostgreSQL doing between t=0 and t=2382 in the Function Scan plan? Apparently that's about how long it takes to run generate_series(), so I'd wager that's exactly what it was doing. The answer starts to take shape: it seems like Result returns results immediately, while it seems like Function Scan materializes the results and then scans them.

With the EXPLAIN out of the way, let's check the implementation. The Result node lives in nodeResult.c, which says:

* DESCRIPTION
 *
 *      Result nodes are used in queries where no relations are scanned.

The code is simple enough.

Function Scan lives in nodeFunctionScan.c, and indeed it appears to take a two phase execution strategy:

/*
 * If first time through, read all tuples from function and put them
 * in a tuplestore. Subsequent calls just fetch tuples from
 * tuplestore.
 */

And for clarity, let's see what a tuplestore is:

* tuplestore.h
 *    Generalized routines for temporary tuple storage.
 *
 * This module handles temporary storage of tuples for purposes such
 * as Materialize nodes, hashjoin batch files, etc.  It is essentially
 * a dumbed-down version of tuplesort.c; it does no sorting of tuples
 * but can only store and regurgitate a sequence of tuples.  However,
 * because no sort is required, it is allowed to start reading the sequence
 * before it has all been written.  This is particularly useful for cursors,
 * because it allows random access within the already-scanned portion of
 * a query without having to process the underlying scan to completion.
 * Also, it is possible to support multiple independent read pointers.
 *
 * A temporary file is used to handle the data if it exceeds the
 * space limit specified by the caller.

Hypothesis confirmed. Function Scan executes upfront, materializing the function's results, which for large result sets results in spilling to disk. Result doesn't materialize anything, but also supports only trivial operations.

Code Snippets

tinker=> EXPLAIN ANALYZE SELECT * FROM generate_series(1,1e7);
                                                           QUERY PLAN                                                           
--------------------------------------------------------------------------------------------------------------------------------
 Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=32) (actual time=2382.582..4291.136 rows=10000000 loops=1)
 Planning time: 0.022 ms
 Execution time: 5539.522 ms
(3 rows)

tinker=> EXPLAIN ANALYZE SELECT generate_series(1,1e7);
                                           QUERY PLAN                                            
-------------------------------------------------------------------------------------------------
 Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.008..2622.365 rows=10000000 loops=1)
 Planning time: 0.045 ms
 Execution time: 3858.661 ms
(3 rows)

* DESCRIPTION
 *
 *      Result nodes are used in queries where no relations are scanned.

/*
 * If first time through, read all tuples from function and put them
 * in a tuplestore. Subsequent calls just fetch tuples from
 * tuplestore.
 */

* tuplestore.h
 *    Generalized routines for temporary tuple storage.
 *
 * This module handles temporary storage of tuples for purposes such
 * as Materialize nodes, hashjoin batch files, etc.  It is essentially
 * a dumbed-down version of tuplesort.c; it does no sorting of tuples
 * but can only store and regurgitate a sequence of tuples.  However,
 * because no sort is required, it is allowed to start reading the sequence
 * before it has all been written.  This is particularly useful for cursors,
 * because it allows random access within the already-scanned portion of
 * a query without having to process the underlying scan to completion.
 * Also, it is possible to support multiple independent read pointers.
 *
 * A temporary file is used to handle the data if it exceeds the
 * space limit specified by the caller.

Context

StackExchange Database Administrators Q#201576, answer score: 13

Revisions (0)

No revisions yet.