HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Join twice on same table with different conditions and no PK / FK relations

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
twicesamerelationswithjoindifferentconditionsandtable

Problem

I have the following tables:

  • agencies: | code |



  • commissions: | registered_at | action | origin | destination | commission |



I need to SUM(commissions.commission) for the following conditions

  • agencies.code = commissions.origin HAVING commissions.action = "collection"



  • agencies.code = commissions.destination HAVING commissions.action = "delivery"



WHERE Date(commissions.registered_at) between '2023-01-01 00:00:00' and '2023-01-31 23:59:59'  
Group By agencies.code


As you can see, these 2 tables have no PK / FK relations.

They are related by the destination / origin and the action being queried.

I tried to make a double Left Join on commissions, but that seems to call the second join for the number of records from the first join. I also don't see how I can make it work with IF conditions inside SELECT, as the conditions to join the tables are different.

Any help is appreciated.

Solution

If registered_at is indexed (with registered_at as the first column ot only column) in the commissions table, then there is a better way to expression the WHERE clauses on registered_at.

This WHERE clause

WHERE DATE(registered_at) BETWEEN '2023-01-01 00:00:00' AND '2023-01-31 23:59:59'


will trigger a table scan for sure because of the DATE() function on the left side.

Please change

WHERE DATE(registered_at) BETWEEN '2023-01-01 00:00:00' AND '2023-01-31 23:59:59'


to

WHERE registered_at >= '2023-01-01 00:00:00' AND registered_at <= '2023-01-31 23:59:59'


This will definitely change the EXPLAIN plan to hopefully use the index.

So the query should look like this

SELECT agencies.code, 
       COALESCE(collections.collections, 0) as collections,
       COALESCE(deliveries.deliveries, 0) as deliveries
FROM agencies
LEFT JOIN (
  SELECT origin as code, 
         SUM(IF(action = 'collection', commission, 0)) as collections,
         SUM(IF(action = 'delivery', commission, 0)) as deliveries
  FROM commissions
  WHERE registered_at >= '2023-01-01 00:00:00' AND registered_at = '2023-01-01 00:00:00' AND registered_at <= '2023-01-31 23:59:59'
  GROUP BY destination
) deliveries ON agencies.code = deliveries.code;


If there is no registered_at index in the commissions table, then create one

ALTER TABLE commissions
    ADD INDEX covering_index (registered_at,action,commission)
   ,ALGORITHM=INPLACE
   ,LOCK=NONE
;


This will create short index scans and not touch the table.

Code Snippets

WHERE DATE(registered_at) BETWEEN '2023-01-01 00:00:00' AND '2023-01-31 23:59:59'
WHERE DATE(registered_at) BETWEEN '2023-01-01 00:00:00' AND '2023-01-31 23:59:59'
WHERE registered_at >= '2023-01-01 00:00:00' AND registered_at <= '2023-01-31 23:59:59'
SELECT agencies.code, 
       COALESCE(collections.collections, 0) as collections,
       COALESCE(deliveries.deliveries, 0) as deliveries
FROM agencies
LEFT JOIN (
  SELECT origin as code, 
         SUM(IF(action = 'collection', commission, 0)) as collections,
         SUM(IF(action = 'delivery', commission, 0)) as deliveries
  FROM commissions
  WHERE registered_at >= '2023-01-01 00:00:00' AND registered_at <= '2023-01-31 23:59:59'
  GROUP BY origin
) collections ON agencies.code = collections.code
LEFT JOIN (
  SELECT destination as code, 
         SUM(IF(action = 'collection', commission, 0)) as collections,
         SUM(IF(action = 'delivery', commission, 0)) as deliveries
  FROM commissions
  WHERE registered_at >= '2023-01-01 00:00:00' AND registered_at <= '2023-01-31 23:59:59'
  GROUP BY destination
) deliveries ON agencies.code = deliveries.code;
ALTER TABLE commissions
    ADD INDEX covering_index (registered_at,action,commission)
   ,ALGORITHM=INPLACE
   ,LOCK=NONE
;

Context

StackExchange Database Administrators Q#322823, answer score: 2

Revisions (0)

No revisions yet.