patternMinor
PostgreSQL 9.4 analysis, performance of normal column, indexed column and jsonb key
Viewed 0 times
postgresqlcolumnanalysisnormalindexedperformanceandjsonbkey
Problem
If I have a table containing:
and
which one would be faster? querying
Could anyone do an analysis for me for those case and case below?
integer test
string test
```
EXPLAIN SELECT * FROM test WHERE name LIKE '%aaa%' ORDER BY random() LIMIT 30;
QUERY PLAN
CREATE TABLE test(
id SERIAL PRIMARY KEY,
name VARCHAR(200),
age INT,
data JSONB
);and
data column populated with {"name": xxx, "age": yyy}, sometimes {"name": xxx, "age": yyy, "somethingElse": zzz}which one would be faster? querying
test.data->>name or test.name ? and test.data->>age or test.age?Could anyone do an analysis for me for those case and case below?
INSERT INTO test(name,age,data)
SELECT
z
, gs.y
, ('{"name":"' || z || '","age":' || gs.y || '}')::JSONB
FROM (SELECT md5(z::text) z, random() AS y FROM generate_series(1,1000000) AS gs(z)) AS gs;
EXPLAIN SELECT * FROM test ORDER BY random() LIMIT 30;
QUERY PLAN
---------------------------------------------------------------------------
Limit (cost=30412.00..30412.07 rows=30 width=458)
-> Sort (cost=30412.00..31139.32 rows=290928 width=458)
Sort Key: (random())
-> Seq Scan on test (cost=0.00..21819.60 rows=290928 width=458)integer test
EXPLAIN SELECT * FROM test WHERE age Sort (cost=43861.14..44694.47 rows=333333 width=116)
Sort Key: (random())
-> Seq Scan on test (cost=0.00..34016.33 rows=333333 width=116)
Filter: ((age)::numeric >'age')::FLOAT Sort (cost=48861.14..49694.47 rows=333333 width=116)
Sort Key: (random())
-> Seq Scan on test (cost=0.00..39016.33 rows=333333 width=116)
Filter: (((data ->> 'age'::text))::double precision >'age')::NUMERIC Sort (cost=48861.14..49694.47 rows=333333 width=116)
Sort Key: (random())
-> Seq Scan on test (cost=0.00..39016.33 rows=333333 width=116)
Filter: (((data ->> 'age'::text))::numeric < 0.5)string test
```
EXPLAIN SELECT * FROM test WHERE name LIKE '%aaa%' ORDER BY random() LIMIT 30;
QUERY PLAN
Solution
- Table definition
Better:
CREATE TABLE test(
test_id serial PRIMARY KEY,
age INT,
name text,
data JSONB
);Due to alignment requirements of the types
integer and varchar / text it's better to put the two int columns first and together. More:- Configuring PostgreSQL for read performance
Also, "age" is a dubious column to begin with. Normally it's better to store a "birthday" or "creation_date" as absolute information, while "age" is rotting quickly.
- Test data
INSERT INTO test(name,age,data)
SELECT name, age, row_to_json(sub)::jsonb
FROM (
SELECT md5(g::text) AS name, (random() * 100)::int AS age
FROM generate_series(1,1000000) g
) sub;-
row_to_json() is much more elegant than building the string by hand.-
If you cast
random() to integer, you get 0 or 1. Largely useless data for any kind of test. In particular, indexes are useless with just two distinct common values. Multiplying by 100 to get something halfway useful.- Why are indexes not used?
Your btree index:
CREATE INDEX test_name ON test(name);is useless for
LIKE queries that are not left-anchored (name LIKE '%aaa%'). Details here:- PostgreSQL LIKE query performance variations
- How is LIKE implemented?
- Pattern matching with LIKE, SIMILAR TO or regular expressions
- Difference between LIKE and ~ in Postgres
Your btree index:
CREATE INDEX test_age on test(age);was useless, since you only had
0 and 1 in your column age, and an idex is typically only used if a small portion (~ below 5 %, it depends) of the table is expected to qualify. Try again with more realistic data.- Test method
EXPLAIN only shows query plans. To get actual performance use EXPLAIN ANALYZE or, for most accurate total times: EXPLAIN ( ANALYZE, TIMING OFF).General advice
Columns with scalar values are generally much smaller and faster in every respect than document types like
jsonb. jsonb (pg 9.4) has received major improvements over json, but this is still true. Queries with predicates on values inside a JSON object are slower than predicates on scalar columns. With optimized (functional) indexes, jsonb can get close, but it will never be as fast.Code Snippets
CREATE TABLE test(
test_id serial PRIMARY KEY,
age INT,
name text,
data JSONB
);INSERT INTO test(name,age,data)
SELECT name, age, row_to_json(sub)::jsonb
FROM (
SELECT md5(g::text) AS name, (random() * 100)::int AS age
FROM generate_series(1,1000000) g
) sub;CREATE INDEX test_name ON test(name);CREATE INDEX test_age on test(age);Context
StackExchange Database Administrators Q#83842, answer score: 3
Revisions (0)
No revisions yet.