HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Why is my nested loop taking so much time?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
whymuchlooptimenestedtaking

Problem

I have the following request. In a nutshell, the wallets_history represents the transactions that happen in the wallet. And the goal is to get 1 point every in_interval from in_from to in_to.

They will eventually be displayed as a histogram to the user. (I simplified the request to make it easier to read, this doesn't impact the issue.)

CREATE OR REPLACE FUNCTION get_wallets_histogram(
  in_wallet_id TEXT, 
  in_code TEXT, 
  in_asset_id TEXT, 
  in_from TIMESTAMPTZ, 
  in_to TIMESTAMPTZ, 
  in_interval INTERVAL)
RETURNS TABLE(date TIMESTAMPTZ, balance_change BIGINT) AS $
  SELECT s AS time,
        change::bigint
  FROM generate_series(in_from,
                       in_to - in_interval,
                       in_interval) s
  LEFT JOIN LATERAL (
      SELECT s, COALESCE(SUM(balance_change),0) change FROM wallets_history
      WHERE  s <= seen_at AND seen_at < s + in_interval
  ) q USING (s)
$ LANGUAGE SQL STABLE;


Then I run this: (Note that the interval is 16 days, so there is only a single row in the lateral request for simplified example)

SELECT * FROM get_wallets_histogram(
  'WHALE', 
  'BTC',  
  '',  
  '2022-01-01'::timestamptz,  
  '2022-02-01'::timestamptz,  
  interval '16 day');


The query plan is

```
Nested Loop Left Join (cost=2526.62..2526662.88 rows=1000 width=16) (actual time=109.301..109.303 rows=1 loops=1)
Output: s.s, (q.change)::bigint
Inner Unique: true
-> Function Scan on pg_catalog.generate_series s (cost=0.01..10.01 rows=1000 width=8) (actual time=104.854..104.855 rows=1 loops=1)
Output: s.s
Function Call: generate_series('2022-01-01 00:00:00+00'::timestamp with time zone, ('2022-02-01 00:00:00+00'::timestamp with time zone - '16 days'::interval), '16 days'::interval)
-> Subquery Scan on q (cost=2526.62..2526.64 rows=1 width=40) (actual time=4.442..4.443 rows=1 loops=1)
Output: q.s, q.change
Filter: (s.s = q.s)
-> Aggregate (cost=2526.62..2526.63

Solution

Summary of the fix

I found out that the jit was taking most of the time in the query, then I tried to track down why JIT was needed for my simple query.

It turns out that generate_series for timestamps confuse the planner: the planner always estimates it returns 1000 rows.
So I created a generate_series_fixed below which add a LIMIT to generate_series, then use this as a drop in replacement instead.

The rest of this answer is the history of my findings.
Here is the drop in replacement if you need it.

-- Technically this LIMIT clause is useless. However, without it the query planner
-- is unable to correctly estimate the numbers of row in generate_series
-- which cause JIT compilation, slowing down the query considerably
-- See https://dba.stackexchange.com/questions/310235/why-is-my-nested-loop-taking-so-much-time/310242#310242
CREATE OR REPLACE FUNCTION generate_series_fixed(in_from TIMESTAMPTZ, in_to TIMESTAMPTZ, in_interval INTERVAL) RETURNS TABLE(s TIMESTAMPTZ) AS $
  SELECT generate_series(in_from, in_to, in_interval)
  LIMIT  (EXTRACT(EPOCH FROM (in_to - in_from))/EXTRACT(EPOCH FROM in_interval)) + 1; -- I am unsure about the exact formula, but over estimating 1 row is fine...
$ LANGUAGE SQL STABLE;


So it seems the culprit was... the JIT!
Using set jit = off completely solves the issue.

Plan with the JIT on

```
WindowAgg (cost=2543486.92..2543511.92 rows=1000 width=24) (actual time=305.248..305.252 rows=1 loops=1)
Output: s.s, (q.change)::bigint, ((sum(q.change) OVER (?) + COALESCE($0, '0'::numeric)))::bigint
InitPlan 1 (returns $0)
-> Limit (cost=8414.21..8414.21 rows=1 width=23) (actual time=29.647..29.648 rows=1 loops=1)
Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
-> Sort (cost=8414.21..8662.97 rows=99505 width=23) (actual time=29.646..29.646 rows=1 loops=1)
Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
Sort Key: wallets_history_1.seen_at DESC, wallets_history_1.blk_height DESC, wallets_history_1.blk_idx DESC
Sort Method: top-N heapsort Memory: 25kB
-> Bitmap Heap Scan on public.wallets_history wallets_history_1 (cost=1871.58..7916.68 rows=99505 width=23) (actual time=3.425..19.254 rows=98750 loops=1)
Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
Recheck Cond: (wallets_history_1.seen_at Bitmap Index Scan on wallets_history_by_seen_at (cost=0.00..1846.71 rows=99505 width=0) (actual time=3.272..3.272 rows=98750 loops=1)
Index Cond: (wallets_history_1.seen_at Sort (cost=2535072.71..2535075.21 rows=1000 width=40) (actual time=275.590..275.592 rows=1 loops=1)
Output: s.s, q.change
Sort Key: s.s
Sort Method: quicksort Memory: 25kB
-> Nested Loop Left Join (cost=2534.99..2535022.88 rows=1000 width=40) (actual time=275.557..275.559 rows=1 loops=1)
Output: s.s, q.change
Inner Unique: true
-> Function Scan on pg_catalog.generate_series s (cost=0.01..10.01 rows=1000 width=8) (actual time=268.958..268.959 rows=1 loops=1)
Output: s.s
Function Call: generate_series('2022-01-01 00:00:00+00'::timestamp with time zone, ('2022-02-01 00:00:00+00'::timestamp with time zone - '16 days'::interval), '16 days'::interval)
-> Subquery Scan on q (cost=2534.98..2535.00 rows=1 width=40) (actual time=6.586..6.587 rows=1 loops=1)
Output: q.s, q.change
Filter: (s.s = q.s)
-> Aggregate (cost=2534.98..2534.99 rows=1 width=40) (actual time=6.583..6.584 rows=1 loops=1)
Output: s.s, COALESCE(sum(wallets_history.balance_change), '0'::numeric)
-> Bitmap Heap Scan on public.wallets_history (cost=27.85..2532.19 rows=1115 width=5) (actual time=0.901..4.737 rows=23040 loops=1)
Output: wallets_history.wallet_id, wallets_history.code, wallets_history.asset_id, wallets_history.tx_id, wallets_history.seen_at, wallets_history.blk_height, wallets_history.blk_idx, wallets_history.bala
nce_change, wallets_history.balance_total, wallets_history.nth
Recheck Cond: ((s.s Bitmap Index Scan on wallets_history_by_seen_at (cost=0.00..27.57 rows=1115 width=0) (actual time=0.864..0.864 rows=23040 loops=1)
Index Cond: ((wallets_history.seen_at >= s.s) AND (wallets_history.seen_at < (s.s + '16 days'::interval)))
Planning Time: 2.333 ms
JIT:
Functions: 27
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: G

Code Snippets

-- Technically this LIMIT clause is useless. However, without it the query planner
-- is unable to correctly estimate the numbers of row in generate_series
-- which cause JIT compilation, slowing down the query considerably
-- See https://dba.stackexchange.com/questions/310235/why-is-my-nested-loop-taking-so-much-time/310242#310242
CREATE OR REPLACE FUNCTION generate_series_fixed(in_from TIMESTAMPTZ, in_to TIMESTAMPTZ, in_interval INTERVAL) RETURNS TABLE(s TIMESTAMPTZ) AS $$
  SELECT generate_series(in_from, in_to, in_interval)
  LIMIT  (EXTRACT(EPOCH FROM (in_to - in_from))/EXTRACT(EPOCH FROM in_interval)) + 1; -- I am unsure about the exact formula, but over estimating 1 row is fine...
$$ LANGUAGE SQL STABLE;
WindowAgg  (cost=2543486.92..2543511.92 rows=1000 width=24) (actual time=305.248..305.252 rows=1 loops=1)
   Output: s.s, (q.change)::bigint, ((sum(q.change) OVER (?) + COALESCE($0, '0'::numeric)))::bigint
   InitPlan 1 (returns $0)
     ->  Limit  (cost=8414.21..8414.21 rows=1 width=23) (actual time=29.647..29.648 rows=1 loops=1)
           Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
           ->  Sort  (cost=8414.21..8662.97 rows=99505 width=23) (actual time=29.646..29.646 rows=1 loops=1)
                 Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
                 Sort Key: wallets_history_1.seen_at DESC, wallets_history_1.blk_height DESC, wallets_history_1.blk_idx DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Bitmap Heap Scan on public.wallets_history wallets_history_1  (cost=1871.58..7916.68 rows=99505 width=23) (actual time=3.425..19.254 rows=98750 loops=1)
                       Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
                       Recheck Cond: (wallets_history_1.seen_at < '2022-01-01 00:00:00+00'::timestamp with time zone)
                       Filter: ((wallets_history_1.wallet_id = 'WHALE'::text) AND (wallets_history_1.code = 'BTC'::text) AND (wallets_history_1.asset_id = ''::text))
                       Heap Blocks: exact=1800
                       ->  Bitmap Index Scan on wallets_history_by_seen_at  (cost=0.00..1846.71 rows=99505 width=0) (actual time=3.272..3.272 rows=98750 loops=1)
                             Index Cond: (wallets_history_1.seen_at < '2022-01-01 00:00:00+00'::timestamp with time zone)
   ->  Sort  (cost=2535072.71..2535075.21 rows=1000 width=40) (actual time=275.590..275.592 rows=1 loops=1)
         Output: s.s, q.change
         Sort Key: s.s
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop Left Join  (cost=2534.99..2535022.88 rows=1000 width=40) (actual time=275.557..275.559 rows=1 loops=1)
               Output: s.s, q.change
               Inner Unique: true
               ->  Function Scan on pg_catalog.generate_series s  (cost=0.01..10.01 rows=1000 width=8) (actual time=268.958..268.959 rows=1 loops=1)
                     Output: s.s
                     Function Call: generate_series('2022-01-01 00:00:00+00'::timestamp with time zone, ('2022-02-01 00:00:00+00'::timestamp with time zone - '16 days'::interval), '16 days'::interval)
               ->  Subquery Scan on q  (cost=2534.98..2535.00 rows=1 width=40) (actual time=6.586..6.587 rows=1 loops=1)
                     Output: q.s, q.change
                     Filter: (s.s = q.s)
                     ->  Aggregate  (cost=2534.98..2534.99 rows=1 width=40) (actual time=6.583..6.584 rows=1 loops=1)
                           Output: s.s, COALESCE(
WindowAgg  (cost=2543486.92..2543511.92 rows=1000 width=24) (actual time=39.259..39.264 rows=1 loops=1)
   Output: s.s, (q.change)::bigint, ((sum(q.change) OVER (?) + COALESCE($0, '0'::numeric)))::bigint
   InitPlan 1 (returns $0)
     ->  Limit  (cost=8414.21..8414.21 rows=1 width=23) (actual time=32.375..32.376 rows=1 loops=1)
           Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
           ->  Sort  (cost=8414.21..8662.97 rows=99505 width=23) (actual time=32.374..32.374 rows=1 loops=1)
                 Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
                 Sort Key: wallets_history_1.seen_at DESC, wallets_history_1.blk_height DESC, wallets_history_1.blk_idx DESC
                 Sort Method: top-N heapsort  Memory: 25kB
                 ->  Bitmap Heap Scan on public.wallets_history wallets_history_1  (cost=1871.58..7916.68 rows=99505 width=23) (actual time=3.147..21.561 rows=98750 loops=1)
                       Output: wallets_history_1.balance_total, wallets_history_1.seen_at, wallets_history_1.blk_height, wallets_history_1.blk_idx
                       Recheck Cond: (wallets_history_1.seen_at < '2022-01-01 00:00:00+00'::timestamp with time zone)
                       Filter: ((wallets_history_1.wallet_id = 'WHALE'::text) AND (wallets_history_1.code = 'BTC'::text) AND (wallets_history_1.asset_id = ''::text))
                       Heap Blocks: exact=1800
                       ->  Bitmap Index Scan on wallets_history_by_seen_at  (cost=0.00..1846.71 rows=99505 width=0) (actual time=3.005..3.005 rows=98750 loops=1)
                             Index Cond: (wallets_history_1.seen_at < '2022-01-01 00:00:00+00'::timestamp with time zone)
   ->  Sort  (cost=2535072.71..2535075.21 rows=1000 width=40) (actual time=6.875..6.877 rows=1 loops=1)
         Output: s.s, q.change
         Sort Key: s.s
         Sort Method: quicksort  Memory: 25kB
         ->  Nested Loop Left Join  (cost=2534.99..2535022.88 rows=1000 width=40) (actual time=6.870..6.873 rows=1 loops=1)
               Output: s.s, q.change
               Inner Unique: true
               ->  Function Scan on pg_catalog.generate_series s  (cost=0.01..10.01 rows=1000 width=8) (actual time=0.023..0.023 rows=1 loops=1)
                     Output: s.s
                     Function Call: generate_series('2022-01-01 00:00:00+00'::timestamp with time zone, ('2022-02-01 00:00:00+00'::timestamp with time zone - '16 days'::interval), '16 days'::interval)
               ->  Subquery Scan on q  (cost=2534.98..2535.00 rows=1 width=40) (actual time=6.846..6.848 rows=1 loops=1)
                     Output: q.s, q.change
                     Filter: (s.s = q.s)
                     ->  Aggregate  (cost=2534.98..2534.99 rows=1 width=40) (actual time=6.845..6.846 rows=1 loops=1)
                           Output: s.s, COALESCE(sum(wallets_hi
CREATE OR REPLACE FUNCTION get_wallets_histogram(in_wallet_id TEXT, in_code TEXT, in_asset_id TEXT, in_from TIMESTAMPTZ, in_to TIMESTAMPTZ, in_interval INTERVAL)
RETURNS TABLE(date TIMESTAMPTZ, balance_change BIGINT, balance BIGINT) AS $$
  SELECT s AS time,
        change::bigint,
        (SUM (q.change) OVER (ORDER BY s) + COALESCE((SELECT balance_total FROM wallets_history WHERE seen_at < in_from AND wallet_id=in_wallet_id AND code=in_code AND asset_id=in_asset_id ORDER BY seen_at DESC, blk_height DESC, blk_idx DESC LIMIT 1), 0))::BIGINT  AS balance
  FROM (-- Technically this LIMIT clause is useless. However, without it the query planner
        -- is unable to correctly estimate the numbers of row in generate_series
        -- which cause JIT compilation, slowing down the query considerably
        SELECT * FROM generate_series(in_from,
        in_to - in_interval,
        in_interval) s LIMIT (EXTRACT(EPOCH FROM (in_to - in_from))/EXTRACT(EPOCH FROM in_interval))) s
  LEFT JOIN LATERAL (
      SELECT s, COALESCE(SUM(balance_change),0) change FROM wallets_history
      WHERE  s <= seen_at AND seen_at < s + in_interval AND wallet_id=in_wallet_id AND code=in_code AND asset_id=in_asset_id
  ) q USING (s)
$$ LANGUAGE SQL STABLE;
-- Technically this LIMIT clause is useless. However, without it the query planner
-- is unable to correctly estimate the numbers of row in generate_series
-- which cause JIT compilation, slowing down the query considerably
-- See https://dba.stackexchange.com/questions/310235/why-is-my-nested-loop-taking-so-much-time/310242#310242
CREATE OR REPLACE FUNCTION generate_series_fixed(in_from TIMESTAMPTZ, in_to TIMESTAMPTZ, in_interval INTERVAL) RETURNS TABLE(s TIMESTAMPTZ) AS $$
  SELECT generate_series(in_from, in_to, in_interval)
  LIMIT  (EXTRACT(EPOCH FROM (in_to - in_from))/EXTRACT(EPOCH FROM in_interval)) + 1; -- I am unsure about the exact formula, but over estimating 1 row is fine...
$$ LANGUAGE SQL STABLE;

Context

StackExchange Database Administrators Q#310235, answer score: 5

Revisions (0)

No revisions yet.