gotchasqlMinor
Why does jsonb use more space for storing integers?
Viewed 0 times
whyspacemoreintegersfordoesusestoringjsonb
Problem
I did some small tests to try these data types, found some puzzling results.
Summary:
jsonb uses more space for integers than strings:
18
28
Why does the size of jsonb inflate when using integers?
Long version:
I was originally interested in using an array of integers, here are those tests. In each pair of comma separated numbers, first corresponds to
8,16
20,26
Check if jsonb actually stores integers as plain text; if so then this next one should use 4 bytes more for jsonb:
22,26
Strange, it uses 2 bytes more, no idea what that means. Try some more practical and realistic data now:
104,40
What happened to jsonb here? What caused it to inflate? Maybe if I use smaller numbers that have less characters...
104,40
Nope. What about strings?
48,40
It seems that jsonb likes strings and not integers.
I quickly made it a rule of thumb that I should only use strings in jsonb and use native integers or arrays for numeric data, but I don't know if
Summary:
jsonb uses more space for integers than strings:
select pg_column_size('{"a":"1"}'::jsonb) as json_size18
select pg_column_size('{"a":1}'::jsonb) as json_size28
Why does the size of jsonb inflate when using integers?
Long version:
I was originally interested in using an array of integers, here are those tests. In each pair of comma separated numbers, first corresponds to
json_size (bytes), second number is array_size (bytes).select pg_column_size('[]'::jsonb) as json_size,
pg_column_size(array[]::smallint[]) as array_size8,16
select pg_column_size('[1]'::jsonb) as json_size,
pg_column_size(array[1]::smallint[]) as array_size20,26
Check if jsonb actually stores integers as plain text; if so then this next one should use 4 bytes more for jsonb:
select pg_column_size('[12345]'::jsonb) as json_size,
pg_column_size(array[12345]::smallint[]) as array_size22,26
Strange, it uses 2 bytes more, no idea what that means. Try some more practical and realistic data now:
select pg_column_size('[123,234,345,456,567,678,789,890]'::jsonb) as json_size,
pg_column_size(array[123,234,345,456,567,678,789,890]::smallint[]) as array_size104,40
What happened to jsonb here? What caused it to inflate? Maybe if I use smaller numbers that have less characters...
select pg_column_size('[1,2,3,4,5,6,7,8]'::jsonb) as json_size,
pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size104,40
Nope. What about strings?
select pg_column_size('["1","2","3","4","5","6","7","8"]'::jsonb) as json_size,
pg_column_size(array[1,2,3,4,5,6,7,8]::smallint[]) as array_size48,40
It seems that jsonb likes strings and not integers.
I quickly made it a rule of thumb that I should only use strings in jsonb and use native integers or arrays for numeric data, but I don't know if
Solution
It's simple. JSON doesn't have a concept of an integer, only a "number". From the docs
When converting textual JSON input into jsonb, the primitive types described by RFC 7159 are effectively mapped onto native PostgreSQL types, as shown in Table 8-23.
That table shows a JSON "number" is mapped to a numeric. The max precision on numeric is
up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
However, I don't think it actually stores that. It does store a numeric of some sort. The numeric source is quite frankly kind of difficult to follow.
Using an array of smallint is clearly better if it fits your workload. It has less wasted space for packing, and less wasted space for the data itself.
When converting textual JSON input into jsonb, the primitive types described by RFC 7159 are effectively mapped onto native PostgreSQL types, as shown in Table 8-23.
That table shows a JSON "number" is mapped to a numeric. The max precision on numeric is
up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
However, I don't think it actually stores that. It does store a numeric of some sort. The numeric source is quite frankly kind of difficult to follow.
Using an array of smallint is clearly better if it fits your workload. It has less wasted space for packing, and less wasted space for the data itself.
smallint[] however- Does not support the same range of numbers.
- Does not support embedding different types.
- Does not support any precision at all.
- Does not exchange well with web-clients.
Context
StackExchange Database Administrators Q#161866, answer score: 4
Revisions (0)
No revisions yet.