HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to make Postgres Multi-Column and Multi-Table Search More Efficient

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
multipostgrescolumnsearchmakemoreefficienthowandtable

Problem

I have a Shipment table with some basic data about a shipment and a ShipmentItem table which contains additional attributes about that shipment with a foreignKey on the Shipment table's primaryKey. The Shipment to ShipmentItem table is OneToMany relationship.

We have a requirement to include a text search option that takes a given inputted text string and searches over 2 of the (make) Shipment's columns in addition to three specific types of the ShipmentItem name column. This is my current query:

select *
from Shipment shipment
where shipment.deliveryRequestedDate >= '2019-06-09T00:00:00Z'
  and shipment.deliveryRequestedDate = '2019-06-09T00:00:00Z'
            and shipmentItem.deliveryRequestedDate <= '2019-12-06T23:59:59Z'
       )
    )
limit 25


The issue I've discovered is the combination of putting a subQuery as one of the or conditions is causing a MAJOR performance problem (even though the subQuery itself returns fast by utilizing a type_name_deliveryRequestedDate index on that table. Although we have multiple indexes on the main table (identifierKeyValues, carrierReferenceNumber, and even an index over all three Shipment columns queried over, it will only use the deliveryRequestedDate index which is extremely inefficient since the range is so large for this query.

Converting this into a JOIN has seemed to cause the same behavior. I'm just not quite sure what the best approach is at this point. We have a Java Persistence API layer above this query so would like to avoid making any major changes to the data model if possible but not sure what the best approach is. Any ideas would be greatly appreciated!

EXPLAIN PLAN:

```
Limit (cost=110.61..209.98 rows=25 width=1370) (actual time=119503.030..124034.809 rows=1 loops=1)
-> Index Scan using shipment_deliveryrequesteddate_idx on shipment shipment (cost=110.61..890840.18 rows=224084 width=1370) (actual time=119503.027..124034.805 rows=1 loops=1)
Inde

Solution

It can't use a BitmapOr against a scan on a different table (or at least, it wasn't coded to be able to do that--perhaps it could be made to do that if someone put in the work--it would have to look up the UUID in the other table, then convert them to tids on the ipso table and stuff them into the bitmap), so it can't use the BitmapOr plan.

Your best bet is probably to write this as a UNION ALL of two different queries, one that just hits the single table and one that hits both.

Context

StackExchange Database Administrators Q#248541, answer score: 5

Revisions (0)

No revisions yet.