HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to get rid of using temporary when using DISTINCT in MySQL?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
riddistincttemporarymysqlgetusinghowwhen

Problem

When I do the following, I get Using where; Using index; Using temporary. How do I get rid of Using temporary?

http://sqlfiddle.com/#!8/f61a2/3/0

CREATE TABLE test (
a varchar(45) NOT NULL,
b varchar(45) NOT NULL,
date date NOT NULL,
PRIMARY KEY (
a,b),
KEY
index1 (b,date),
KEY
index2 (b,date,a)
) ENGINE=InnoDB;

INSERT INTO test (a, b, date) VALUES ('a1', 'b1', now());
INSERT INTO test (a, b, date) VALUES ('a1', 'b2', now());
INSERT INTO test (a, b, date) VALUES ('a2', 'b1', now());
INSERT INTO test (a, b, date) VALUES ('a2', 'b2', now());

INSERT INTO test (a, b, date) VALUES ('a3', 'b1', '2000-01-01 01:01:01');
INSERT INTO test (a, b, date) VALUES ('a3', 'b2', '2000-01-01 01:01:01');


EXPLAIN SELECT
DISTINCT a
FROM test
WHERE b IN ('b1', 'b2') AND
date > NOW() - INTERVAL 1 MONTH;

Solution

Do you have performance issues because this sounds like premature optimization?

The only good single query what uses indexes is an UNION (which will also filter out duplicates).

If you run EXPLAIN on this query, using temporary won't pop up in the explain..

SELECT
 a
FROM test
WHERE b = 'b1' AND
date > NOW() - INTERVAL 1 MONTH

UNION

SELECT
 a
FROM test
WHERE b = 'b2' AND
date > NOW() - INTERVAL 1 MONTH
;


Edited 2019-09-01


Demo is not valid anymore as sqlfiddle does not run the old MySQL version anymore
and because MySQL source code was updated...


I would suggest looking into db-fiddle and switch between MySQL versions 5.5/5.6/5.7, then the problem should be visible between 5.5 and 5.6

By looking into source code file sql/union.cc (C++ code) of MySQL version 5.7.2 m12 (most likely (all) lower versions also using this code, but not sure about that), we can see how UNION works in MySQL. You will discover that the EXPLAIN output sometimes tells you lies, because the source code (in one of the input parameters) indicates that:


is_union_distinct if set, the temporary table will eliminate
duplicates on insert

This makes sense because duplicated records are filtered out. Source code snapshot is below:

/*
  Create a temporary table to store the result of select_union.

  SYNOPSIS
    select_union::create_result_table()
      thd                thread handle
      column_types       a list of items used to define columns of the
                         temporary table
      is_union_distinct  if set, the temporary table will eliminate
                         duplicates on insert
      options            create options
      table_alias        name of the temporary table
      bit_fields_as_long convert bit fields to ulonglong

  DESCRIPTION
    Create a temporary table that is used to store the result of a UNION,
    derived table, or a materialized cursor.

  RETURN VALUE
    0                    The table has been created successfully.
    1                    create_tmp_table failed.
*/

bool
select_union::create_result_table(THD *thd_arg, List *column_types,
                                  bool is_union_distinct, ulonglong options,
                                  const char *table_alias,
                                  bool bit_fields_as_long, bool create_table)
{
  DBUG_ASSERT(table == 0);
  tmp_table_param.init();
  count_field_types(thd_arg->lex->current_select(), &tmp_table_param,
                    *column_types, false, true);
  tmp_table_param.skip_create_table= !create_table;
  tmp_table_param.bit_fields_as_long= bit_fields_as_long;

  if (! (table= create_tmp_table(thd_arg, &tmp_table_param, *column_types,
                                 (ORDER*) 0, is_union_distinct, 1,
                                 options, HA_POS_ERROR, (char*) table_alias)))
    return TRUE;
  if (create_table)
  {
    table->file->extra(HA_EXTRA_WRITE_CACHE);
    table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  }
  return FALSE;

}


There is a function that is called create_tmp_table, and thus MySQL is always creating a temporary table when the function create_result_table is called

Code Snippets

SELECT
 a
FROM test
WHERE b = 'b1' AND
date > NOW() - INTERVAL 1 MONTH

UNION

SELECT
 a
FROM test
WHERE b = 'b2' AND
date > NOW() - INTERVAL 1 MONTH
;
/*
  Create a temporary table to store the result of select_union.

  SYNOPSIS
    select_union::create_result_table()
      thd                thread handle
      column_types       a list of items used to define columns of the
                         temporary table
      is_union_distinct  if set, the temporary table will eliminate
                         duplicates on insert
      options            create options
      table_alias        name of the temporary table
      bit_fields_as_long convert bit fields to ulonglong

  DESCRIPTION
    Create a temporary table that is used to store the result of a UNION,
    derived table, or a materialized cursor.

  RETURN VALUE
    0                    The table has been created successfully.
    1                    create_tmp_table failed.
*/

bool
select_union::create_result_table(THD *thd_arg, List<Item> *column_types,
                                  bool is_union_distinct, ulonglong options,
                                  const char *table_alias,
                                  bool bit_fields_as_long, bool create_table)
{
  DBUG_ASSERT(table == 0);
  tmp_table_param.init();
  count_field_types(thd_arg->lex->current_select(), &tmp_table_param,
                    *column_types, false, true);
  tmp_table_param.skip_create_table= !create_table;
  tmp_table_param.bit_fields_as_long= bit_fields_as_long;

  if (! (table= create_tmp_table(thd_arg, &tmp_table_param, *column_types,
                                 (ORDER*) 0, is_union_distinct, 1,
                                 options, HA_POS_ERROR, (char*) table_alias)))
    return TRUE;
  if (create_table)
  {
    table->file->extra(HA_EXTRA_WRITE_CACHE);
    table->file->extra(HA_EXTRA_IGNORE_DUP_KEY);
  }
  return FALSE;

}

Context

StackExchange Database Administrators Q#54178, answer score: 5

Revisions (0)

No revisions yet.