HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Trouble with a NOT IN query

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
withtroublenotquery

Problem

SELECT DISTINCT d.customer_id, d.date_added FROM `order` d
WHERE d.customer_id NOT IN (
    SELECT DISTINCT i.customer_id 
    FROM `order` i 
    WHERE i.date_added > '2015-02-15 14:00:00'
)
ORDER BY d.date_added DESC;


The above query should return customer_id of customers who have not ordered after 15 Feb 2015 (I think). But very first record is

17168, 2015-08-16 17:36:00

What am I doing wrong?

This below query

SELECT DISTINCT i.customer_id,i.date_added FROM `order` i 
WHERE i.date_added > '2015-02-15 14:00:00' 
ORDER BY i.date_added ASC;


returns expected result i.e. list of customer ids for orders placed after 15 Feb

P.S. customer_id can not be NULL

P.P.S.

``
mysql> SHOW CREATE TABLE
order`;
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Solution

This looks like a bug. The improvements to the optimizer regarding anti-joins and subquery materialization, introduced in 5.6, seem relevant. There was a similar bug (#73368), related to subquery materialization but it has been fixed in 5.6.22 version.

I suggest you first, try running the query with each of the following options and check if you get the same unexpected/wrong results:

set optimizer_switch = 'semijoin=off';

set optimizer_switch = 'materialization=off';


Just noticed that it's a MyISAM table, so it's worth checking as well if it is a corruption issue (you can CHECK and/or REPAIR the table).

Then update your installation with the latest 5.6.26 version and check again with default options (and again with the 2 options above).

If you are still getting wrong results, report the bug to MySQL, adding the query, the table CREATE statement, the EXPLAIN output and uploading a dump of the table as well, so they can reproduce it. If you can reproduce the bug with a smaller table (just the 2-3 columns and a few rows) would be much better than a 50-column and a thousands rows table.

Code Snippets

set optimizer_switch = 'semijoin=off';

set optimizer_switch = 'materialization=off';

Context

StackExchange Database Administrators Q#111218, answer score: 4

Revisions (0)

No revisions yet.