snippetsqlMinor
How to make Postgres Multi-Column and Multi-Table Search More Efficient
Viewed 0 times
multipostgrescolumnsearchmakemoreefficienthowandtable
Problem
I have a
We have a requirement to include a text search option that takes a given inputted text string and searches over 2 of the (make)
The issue I've discovered is the combination of putting a subQuery as one of the
Converting this into a JOIN has seemed to cause the same behavior. I'm just not quite sure what the best approach is at this point. We have a Java Persistence API layer above this query so would like to avoid making any major changes to the data model if possible but not sure what the best approach is. Any ideas would be greatly appreciated!
EXPLAIN PLAN:
```
Limit (cost=110.61..209.98 rows=25 width=1370) (actual time=119503.030..124034.809 rows=1 loops=1)
-> Index Scan using shipment_deliveryrequesteddate_idx on shipment shipment (cost=110.61..890840.18 rows=224084 width=1370) (actual time=119503.027..124034.805 rows=1 loops=1)
Inde
Shipment table with some basic data about a shipment and a ShipmentItem table which contains additional attributes about that shipment with a foreignKey on the Shipment table's primaryKey. The Shipment to ShipmentItem table is OneToMany relationship.We have a requirement to include a text search option that takes a given inputted text string and searches over 2 of the (make)
Shipment's columns in addition to three specific types of the ShipmentItem name column. This is my current query:select *
from Shipment shipment
where shipment.deliveryRequestedDate >= '2019-06-09T00:00:00Z'
and shipment.deliveryRequestedDate = '2019-06-09T00:00:00Z'
and shipmentItem.deliveryRequestedDate <= '2019-12-06T23:59:59Z'
)
)
limit 25The issue I've discovered is the combination of putting a subQuery as one of the
or conditions is causing a MAJOR performance problem (even though the subQuery itself returns fast by utilizing a type_name_deliveryRequestedDate index on that table. Although we have multiple indexes on the main table (identifierKeyValues, carrierReferenceNumber, and even an index over all three Shipment columns queried over, it will only use the deliveryRequestedDate index which is extremely inefficient since the range is so large for this query.Converting this into a JOIN has seemed to cause the same behavior. I'm just not quite sure what the best approach is at this point. We have a Java Persistence API layer above this query so would like to avoid making any major changes to the data model if possible but not sure what the best approach is. Any ideas would be greatly appreciated!
EXPLAIN PLAN:
```
Limit (cost=110.61..209.98 rows=25 width=1370) (actual time=119503.030..124034.809 rows=1 loops=1)
-> Index Scan using shipment_deliveryrequesteddate_idx on shipment shipment (cost=110.61..890840.18 rows=224084 width=1370) (actual time=119503.027..124034.805 rows=1 loops=1)
Inde
Solution
It can't use a BitmapOr against a scan on a different table (or at least, it wasn't coded to be able to do that--perhaps it could be made to do that if someone put in the work--it would have to look up the UUID in the other table, then convert them to tids on the ipso table and stuff them into the bitmap), so it can't use the BitmapOr plan.
Your best bet is probably to write this as a UNION ALL of two different queries, one that just hits the single table and one that hits both.
Your best bet is probably to write this as a UNION ALL of two different queries, one that just hits the single table and one that hits both.
Context
StackExchange Database Administrators Q#248541, answer score: 5
Revisions (0)
No revisions yet.