HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Can jsonb do fast search by *value* or even a partial value? Other types?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fasttypescansearchvaluepartialevenotherjsonb

Problem

I have a person table where arbitrary data must be inserted. Since I'm using Postgres 9.4, jsonb appears to be the right choice.

Example data:

id: 1, name: "Joe Doe", preferences: { color: "red" , toy: "car"}
id: 2, name: "Jane Doe", preferences: { color: "blue", food: "hamburguer" }


The problem is that I need to query it by the variable values, example:

All the people who has "hamburger" on preferences, and partial queries must be possible too, so searching for "burg" must bring me record 2.

Using json functions, I can search for values, as long as I known the key, right? Or there is a fast way to search for the values?

What I'm thinking to do is to create a tsvector column and dump the json to use fulltext search to do the partial queries.

Solution

For Postgres 12 or later consider the SQL/JSON path language. See:

  • Pattern-matching for JSON values: slow EXISTS subquery on materialized view



Original answer (for older versions)

I suggest a MATERIALIZED VIEW with unnested values and a trigram index as search tools.
hstore

If you don't need nested values, hstore may be even better for you. Use the function svals(hstore) to unnest hstore values.

You need to install the additional module hstore once per database:

CREATE EXTENSION hstore;


  • Key value pair in PostgreSQL



Table

CREATE TABLE person (
  person_id int PRIMARY KEY
, name text
, preferences jsonb
);
INSERT INTO person VALUES
  (1, 'Joe Doe', jsonb  '{"toy": "car", "color": "red"}')
, (2, 'Jane Doe', '{"food": "hamburguer", "color": "blue"}')
;


Materialized view

CREATE MATERIALIZED VIEW person_pref AS
SELECT p.person_id, j.preference        -- just pref
FROM   person p, svals(p.preferences) j(preference);


This is an implicit CROSS JOIN LATERAL to the set-returning function svals().
Trigram Index

You need to install the additional module pg_trgm once per database:

CREATE EXTENSION pg_trgm;


Then:

CREATE INDEX person_pref_trgm_idx ON person_pref
USING gin (preference gin_trgm_ops);


Details:

  • How is LIKE implemented?



  • Pattern matching with LIKE, SIMILAR TO or regular expressions



Query

SELECT *
FROM   person p
WHERE  EXISTS (
   SELECT FROM person_pref pp
   WHERE  pp.person_id = p.person_id
   AND    pp.preference ILIKE '%burg%'
   );


This is fast for selective search terms.
jsonb

If you have nested values or numeric or boolean values, jsonb may be more efficient. You can do the same as above with jsonb_each_text(jsonb):

CREATE MATERIALIZED VIEW person_pref AS
SELECT p.person_id, j.preference
FROM  person p, jsonb_each_text(p.preferences) AS j(key, preference);


Same index, same query.

db<>fiddle here

Code Snippets

CREATE EXTENSION hstore;
CREATE TABLE person (
  person_id int PRIMARY KEY
, name text
, preferences jsonb
);
INSERT INTO person VALUES
  (1, 'Joe Doe', jsonb  '{"toy": "car", "color": "red"}')
, (2, 'Jane Doe', '{"food": "hamburguer", "color": "blue"}')
;
CREATE MATERIALIZED VIEW person_pref AS
SELECT p.person_id, j.preference        -- just pref
FROM   person p, svals(p.preferences) j(preference);
CREATE EXTENSION pg_trgm;
CREATE INDEX person_pref_trgm_idx ON person_pref
USING gin (preference gin_trgm_ops);

Context

StackExchange Database Administrators Q#108447, answer score: 9

Revisions (0)

No revisions yet.