HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How can I remove only consecutive duplicate fields when the id is not consecutive in MySQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
cantheduplicatefieldsremovemysqlwhenhownotonly

Problem

In this table, I have a price change history. The prices are checked regularly and as you can see, sometimes the price does not change. How can I take a specific sku and delete all cases where a duplicate price is consecutively measured. As you can see, on 08-08, the price changed to 79.47. Then, it stayed that way until 08-11 when it changed to 79.41. I don't need all the repeated 79.47 since I can just assume that the price didn't change. So, I just need each point in time that the price changed, and that will allow me to save a lot of space.

```
╔══════════╦════════╦════════╦═════════════════════╗
║ id ║ sku ║ amount ║ date ║
╠══════════╬════════╬════════╬═════════════════════╣
║ 225200 ║ 388744 ║ 60 ║ 2014-08-02 22:08:18 ║
║ 1571656 ║ 388744 ║ 79.47 ║ 2014-08-06 20:53:36 ║
║ 1572002 ║ 388744 ║ 79.47 ║ 2014-08-06 20:53:42 ║
║ 2092567 ║ 388744 ║ 79.45 ║ 2014-08-07 01:22:13 ║
║ 2608362 ║ 388744 ║ 79.44 ║ 2014-08-07 05:50:45 ║
║ 4403138 ║ 388744 ║ 79.44 ║ 2014-08-07 18:59:02 ║
║ 4935625 ║ 388744 ║ 79.44 ║ 2014-08-07 23:18:56 ║
║ 5356014 ║ 388744 ║ 79.47 ║ 2014-08-08 03:41:30 ║
║ 5703764 ║ 388744 ║ 79.47 ║ 2014-08-08 08:00:47 ║
║ 8559295 ║ 388744 ║ 79.47 ║ 2014-08-09 22:40:42 ║
║ 8878123 ║ 388744 ║ 79.47 ║ 2014-08-10 02:54:05 ║
║ 9204053 ║ 388744 ║ 79.47 ║ 2014-08-10 07:13:06 ║
║ 9538507 ║ 388744 ║ 79.47 ║ 2014-08-10 11:27:51 ║
║ 9863478 ║ 388744 ║ 79.47 ║ 2014-08-10 15:44:34 ║
║ 10184051 ║ 388744 ║ 79.47 ║ 2014-08-10 20:02:29 ║
║ 10503334 ║ 388744 ║ 79.41 ║ 2014-08-11 00:18:44 ║
║ 10821782 ║ 388744 ║ 79.33 ║ 2014-08-11 04:34:22 ║
║ 11135386 ║ 388744 ║ 79.33 ║ 2014-08-11 08:52:31 ║
║ 11446160 ║ 388744 ║ 79.33 ║ 2014-08-11 13:07:56 ║
║ 11760103 ║ 388744 ║ 79.33 ║ 2014-08-11 17:26:09 ║
║ 12074366 ║ 388744 ║ 79.33 ║ 2014-08-11 21:42:37 ║
║ 12399508 ║ 388744 ║ 79.47 ║ 2014-08-12 02:13:05 ║
║ 12726720 ║ 388744 ║ 79.47 ║ 2014-08-12 06:45:28 ║
║ 12726970 ║ 388744 ║ 79.47 ║ 2014-08-12 06:45:37 ║
║ 13059

Solution

This should do the job :

SELECT
  ph1.sku, 
  ph1.id as ph1Id, 
  ph2.id as ph2Id, 
  ph1.histDate,
  ph2.amount as oldAmount, 
  ph1.amount as newAmount 
FROM priceHistory as ph1 
LEFT JOIN priceHistory as ph2 ON
ph2.id = 
(SELECT max(id) FROM priceHistory AS ph3 WHERE ph3.id  ph2.amount OR 
      ph2.amount is null


Here is the fiddle.

Code Snippets

SELECT
  ph1.sku, 
  ph1.id as ph1Id, 
  ph2.id as ph2Id, 
  ph1.histDate,
  ph2.amount as oldAmount, 
  ph1.amount as newAmount 
FROM priceHistory as ph1 
LEFT JOIN priceHistory as ph2 ON
ph2.id = 
(SELECT max(id) FROM priceHistory AS ph3 WHERE ph3.id < ph1.id and ph1.SKU = ph3.SKU)
WHERE ph1.amount <> ph2.amount OR 
      ph2.amount is null

Context

StackExchange Database Administrators Q#82820, answer score: 2

Revisions (0)

No revisions yet.