HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Improve query performance with index

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
withqueryimproveperformanceindex

Problem

I am looking to create an index for a database to acheive better performance.
I am using this query:

SELECT DISTINCT t1.* FROM current_order
AS t1 LEFT JOIN
receipt AS t2 USING (paper_id,subscriber_id)
WHERE t2.id NOT IN (SELECT id
FROM receipt WHERE paid_till_date > Now())
UNION
SELECT current_order.* FROM receipt RIGHT JOIN
current_order USING (paper_id, subscriber_id)
Where receipt.id IS NULL ORDER BY
subscriber_id, paper_id


Tables are:

PAPER
    id
    name

 SUBSCRIBER
    id
    name
    address
    suburb
    state
    postcode
    round_id

 CURRENT ORDER
    paper_id
    subscriber_id

 ROUND
    id
    name
    paperboy

 RECEIPT
    id
    receipt_date
    paid_till_date
    paper_id
    subscriber_id


I understand that primary keys are already indexed, and that i should use an index where WHERE clauses and ORDER BY clauses are used frequently. So i'm assuming that the subscribers Name would be a good one to index?
And also the receipt date?

Any tips appreciated

Solution

The original query

SELECT DISTINCT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE t2.id NOT IN (
    SELECT id
      FROM receipt 
      WHERE paid_till_date > Now()
    )
UNION
  SELECT current_order.* 
    FROM receipt 
      RIGHT JOIN current_order USING (paper_id, subscriber_id)
    WHERE receipt.id IS NULL 
    ORDER BY subscriber_id, paper_id


is more complex than necessary.

The base tables are

CURRENT ORDER
  paper_id
  subscriber_id


and

RECEIPT
  id
  receipt_date
  paid_till_date
  paper_id
  subscriber_id


The OP stated that receipt.id is a primary key and I think that (paper_id,subscriber_id) is the primary key of current order.

Both queries

SELECT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE t2.id NOT IN (
    SELECT id
      FROM receipt 
      WHERE paid_till_date > Now()
    )


and

SELECT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE paid_till_date <= Now()
    or paid_till_date  is NULL


are equivalent: The resultset f both queries contains exactly the following tuples:

all tuples from current_order where there is a RECEIPT with the same (paper_id,subscriber_id) and paid_till_date <= Now()
all tuples from current_order where there is a RECEIPT with the same (paper_id,subscriber_id) and paid_till_date is NULL
all tuples from current_order where there is no RECEIPT with the same (paper_id,subscriber_id) (and therefore paid_till_date  is NULL)


Therefore the queries are equivalent. And they are also equivalent if an DISTINCT is added to the select clause.

The resultset of the query

SELECT current_order.* 
    FROM receipt 
      RIGHT JOIN current_order USING (paper_id, subscriber_id)
    WHERE receipt.id IS NULL 
    ORDER BY subscriber_id, paper_id


contains the following tuples

all tuples from current_order where there is no RECEIPT with the same (paper_id,subscriber_id) (and therefore receipt.id is NULL)


So the original query can be changed to the simpler query

SELECT DISTINCT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE paid_till_date <= Now()
    or paid_till_date  is NULL


Perhaps an index on (paper_id,subscriber_id,paid_till_date) will be usefull. The query uses only columns found in the index (receipt.id is not used anymore). An Index on (paper_id,subscriber_id) of "current order" already exists because this is the primary key.

Code Snippets

SELECT DISTINCT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE t2.id NOT IN (
    SELECT id
      FROM receipt 
      WHERE paid_till_date > Now()
    )
UNION
  SELECT current_order.* 
    FROM receipt 
      RIGHT JOIN current_order USING (paper_id, subscriber_id)
    WHERE receipt.id IS NULL 
    ORDER BY subscriber_id, paper_id
CURRENT ORDER
  paper_id
  subscriber_id
RECEIPT
  id
  receipt_date
  paid_till_date
  paper_id
  subscriber_id
SELECT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE t2.id NOT IN (
    SELECT id
      FROM receipt 
      WHERE paid_till_date > Now()
    )
SELECT t1.* 
  FROM current_order AS t1 
    LEFT JOIN receipt AS t2 USING (paper_id,subscriber_id)
  WHERE paid_till_date <= Now()
    or paid_till_date  is NULL

Context

StackExchange Database Administrators Q#22877, answer score: 2

Revisions (0)

No revisions yet.