HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

An alternative to querying with multiple JOIN over the same table?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
samethealternativewithjoinqueryingmultipleovertable

Problem

I have a Postgresql 11 database.
Let's say I have a table called houses. It should have hundreds of thousands of records.

CREATE TABLE houses (
  pkid serial primary key,
  address varchar(255) NOT NULL,
  rent float NOT NULL
);


Now, my houses have features I want to register in the database. As the list of possible features will be quite long (several dozens) and will evolve over time, as I don't want add a long list of columns to the table houses and change the table constantly with 'ALTER TABLE', I thought of having a separate table for these features :

CREATE TABLE house_features (
   pkid serial primary key,
   house_pkid integer NOT NULL,
   feature_name varchar(255) NOT NULL,
   feature_value varchar(255)
);
CREATE INDEX ON house_features (feature_name, feature_value);
ALTER TABLE house_features ADD CONSTRAINT features_fk FOREIGN KEY (house_pkid) REFERENCES houses (pkid) ON DELETE CASCADE;


In average, each house record will have 10-20 records in the house_features table.

So far, this seems a simple efficient model : I can add as many different features, controlling the possible values of feature_name and feature_value in the upper layers (the applicative layer and/or the GUI). I don't have to alter the database each time the application evolves and I need a new type of feature.

For the example, let's say I have the following features :

  • feature_name : 'rooftype' with possible feature_value : 'flat' or 'inclined'



  • feature_name : 'wallcolors' with possible feature_value : 'white', 'beige', 'blue', 'green', etc.. (15 different possible values)



  • feature_name : 'has_basement' with possible feature_value : 'True' or 'False'.



  • feature_name : 'number_of_doors' with possible feature_value any integer coded as a string (so '0', '1', '2', ...).



  • feature_name : 'floor_surface' with possible feature_value any given float coded as a string (e.g.: '155.2')



Obviously, storing booleans, integers and floats as strings is not very efficient an

Solution

You could try to aggregate the features into a JSON value, then searching for a combination of multiple features is quite easy:

select h.*, hf.features
from houses
  join (
    select house_id, jsonb_object_agg(feature_name, feature_value) as features
    from house_features
    group by house_id
  ) hf on hf.house_pkid = h.pkid 
where hf.features @> '{"rooftype": "flat", "has_basement", "true", "wallcolors": "white"}';


Performance can be improved by adding a WHERE clause to the sub-select which repeats the feature names, e.g:

where feature_name in ('rooftype', 'has_basement', 'wallcolors')


or even

where (feature_name, feature_value) in (('rooftype', 'flat') ('has_basement', 'true'), ('wallcolors', 'white'))


The outer condition is still necessary, because the inner where will include houses that don't have all the features.

This also has the advantage (in my eyes) that you only get one row with all the features, rather then one row for each feature.

Unless you remove, add and change features for a house very frequently, storing them as a single JSONB column on the house table (features) and getting rid of the house_features table, might be an alternative. In that case you could create an index on the column to speed up the search.

Code Snippets

select h.*, hf.features
from houses
  join (
    select house_id, jsonb_object_agg(feature_name, feature_value) as features
    from house_features
    group by house_id
  ) hf on hf.house_pkid = h.pkid 
where hf.features @> '{"rooftype": "flat", "has_basement", "true", "wallcolors": "white"}';
where feature_name in ('rooftype', 'has_basement', 'wallcolors')
where (feature_name, feature_value) in (('rooftype', 'flat') ('has_basement', 'true'), ('wallcolors', 'white'))

Context

StackExchange Database Administrators Q#267840, answer score: 6

Revisions (0)

No revisions yet.