HiveBrain v1.2.0
Get Started
← Back to all entries
gotchasqlMinor

Why does adding a smallint column to a postgres query increase execution time 60X?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
smallintwhy60xpostgrescolumnqueryaddingtimeincreasedoes

Problem

Here I have what seems to me to be a routine table and a table creation script that I have put in /tmp/try.sql. I am running postgres under OSX 10.10.2.

Here is the table:

cow_dev=# \d carc
                                Table "public.carc"
    Column     |     Type     |                         Modifiers
---------------+--------------+---------------------------------------------------   ---------
 internal_id   | integer      | not null default nextval('carc_internal_id_seq'::regclass)
 acct_nbr      | character(6) |
 birth_date    | date         |
 anm_key       | integer      |
 slghtr_dt     | date         |
 sire_assoc_id | integer      |
 sire_reg      | text         |
 dam_assoc_id  | integer      |
 dam_reg       | text         |
 sex           | text         |
 carc_kphf_pct | real         |
 carc_wt       | integer      |
 marbling      | integer      |
 ribeye_area   | real         |
 usda_qlty_grd | smallint     |
 act_fat_thick | real         |
 carcass_group | text         |
 maturity      | integer      |
Indexes:
    "idx_a649568319866892fcdd3742289e8294" PRIMARY KEY, btree (internal_id)
    "idx_d4659e9f750ff68c75e11e89a950e386" btree (anm_key)
    "idx_d57025002b9ce54cf590b76e87c10cac" btree (acct_nbr)
Foreign-key constraints:
    "carc_acct_nbr__acct_acct_nbr_fk" FOREIGN KEY (acct_nbr) REFERENCES acct(acct_nbr)
    "carc_anm_key__anm_anm_key_fk" FOREIGN KEY (anm_key) REFERENCES anm(anm_key)


Here is the script:

```
create temp table carc_out as
with assoc_map as
(select
a.assoc_id,
c.code_3 || a.brd_cd_id mb_name
from
assoc a
join country_code c on c.code_2 = a.country_code_2)
select
c.internal_id,
n.mb_assoc_reg anm_nbr,
c.act_fat_thick,
c.carc_kphf_pct,
c.carcass_group,
c.carc_wt,
null::float carc_yld,
c.marbling,
c.maturity,
c.ribeye_area,
to_char(c.slghtr_dt, 'mmddyyyy') slghtr_dt,

Solution

Short answer: You need a sightly larger work_mem. Try set work_mem in your session.

Explanation:

Compare these two step:

->  Hash  (cost=642.32..642.32 rows=25332 width=51) (actual time=46.444..46.444 rows=25332 loops=1)
                 Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
                 Buckets: 2048  Batches: 128 (originally 4)  Memory Usage: 1025kB
                 ->  Seq Scan on public.carc c  (cost=0.00..642.32 rows=25332 width=51) (actual time=0.007..18.372 rows=25332 loops=1)
                       Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id


and

->  Hash  (cost=642.32..642.32 rows=25332 width=53) (actual time=96.305..96.305 rows=25332 loops=1)
                 Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
                 Buckets: 2048  Batches: 65536 (originally 4)  Memory Usage: 1028kB
                 ->  Seq Scan on public.carc c  (cost=0.00..642.32 rows=25332 width=53) (actual time=0.007..15.858 rows=25332 loops=1)
                       Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id


You see the slower query use 65536 batches, the faster one use 128.
This is because with the extra field, the batch can't fit in work_mem.. it need to split to smaller batches.

Code Snippets

->  Hash  (cost=642.32..642.32 rows=25332 width=51) (actual time=46.444..46.444 rows=25332 loops=1)
                 Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
                 Buckets: 2048  Batches: 128 (originally 4)  Memory Usage: 1025kB
                 ->  Seq Scan on public.carc c  (cost=0.00..642.32 rows=25332 width=51) (actual time=0.007..18.372 rows=25332 loops=1)
                       Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
->  Hash  (cost=642.32..642.32 rows=25332 width=53) (actual time=96.305..96.305 rows=25332 loops=1)
                 Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id
                 Buckets: 2048  Batches: 65536 (originally 4)  Memory Usage: 1028kB
                 ->  Seq Scan on public.carc c  (cost=0.00..642.32 rows=25332 width=53) (actual time=0.007..15.858 rows=25332 loops=1)
                       Output: c.internal_id, c.act_fat_thick, c.carc_kphf_pct, c.carcass_group, c.carc_wt, c.marbling, c.maturity, c.ribeye_area, c.slghtr_dt, c.usda_qlty_grd, c.anm_key, c.dam_assoc_id, c.sire_assoc_id

Context

StackExchange Database Administrators Q#125549, answer score: 3

Revisions (0)

No revisions yet.