HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Selecting Without Repititions

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
repititionswithoutselecting

Problem

I have table with observations of objects moving along edges in a graph, this table has the following form:

PK | TIMESTAMP | object_id | from_id | to_id


where object_id is the id of some object and from_id and to_id are vertices.

Since the movements are observed at a high frequency the tuple

(object_id, from_id, to_id)


is repeated often for different PK and TIMESTAMPS. I'm interested in all the separate edge traversals, so if an object with id 1 moves from vertex 1 to 2, from 2 to 1 and from 1 to 2 I want to have a result:

object_id | from_id | to_id
        1 |       1 |     2
        1 |       2 |     1
        1 |       1 |     2


My question: how to write this query?

Solution

The tuples would have to be ordered by from_id, to_id in such a way that when from_id or to_id changes, it would fall under a different grouping.
I was thinking of writing this as a Stored Procedure, but I thought of something much more intriguing (I just got the idea from my answering this question Update ranking on table about 3 hours ago)

I'll do it in stages

Stage 1 : Sample Data

DROP DATABASE IF EXISTS bootvis;
CREATE DATABASE bootvis;
USE bootvis
CREATE TABLE graph_edges
(
    pk int not null auto_increment,
    object_id int default null,
    tm timestamp not null default current_timestamp,
    from_id int,
    to_id int,
    primary key (pk)
);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
SELECT SLEEP(2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
SELECT SLEEP(1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
UPDATE graph_edges SET object_id = 1 WHERE object_id IS NULL;
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
SELECT SLEEP(2);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
UPDATE graph_edges SET object_id = 2 WHERE object_id IS NULL;
SELECT * FROM graph_edges;


Stage 2 : Sample Data Loaded

```
mysql> DROP DATABASE IF EXISTS bootvis;
Query OK, 1 row affected (0.03 sec)

mysql> CREATE DATABASE bootvis;
Query OK, 1 row affected (0.00 sec)

mysql> USE bootvis
Database changed
mysql> CREATE TABLE graph_edges
-> (
-> pk int not null auto_increment,
-> object_id int default null,
-> tm timestamp not null default current_timestamp,
-> from_id int,
-> to_id int,
-> primary key (pk)
-> );
Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.05 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql> SELECT SLEEP(2);
+----------+
| SLEEP(2) |
+----------+
| 0 |
+----------+
1 row in set (2.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> SELECT SLEEP(1);
+----------+
| SLEEP(1) |
+----------+
| 0 |
+----------+
1 row in set (1.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
Query OK, 2 rows affected (0.07 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
Query OK, 8 rows affected (0.08 sec)
Records: 8 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.13 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.10 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> UPDATE graph_edges SET object_id = 1 WHERE object_id IS NULL;
Query OK, 22 rows affected (0.06 sec)
Rows matched: 22 Changed: 22 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
Query OK, 2 rows affected (0.05 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
Query OK, 8 rows affected (0.05 sec)
Records: 8 Duplicates: 0 Warnings: 0

mysql> SELECT SLEEP(2);
+----------+
| SLEEP(2) |
+----------+
| 0 |
+----------+
1 row in set (2.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.08 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.06 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.05 sec)
Records: 4 Duplicates: 0 Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.06 sec)
Records: 2 Duplicates: 0

Code Snippets

DROP DATABASE IF EXISTS bootvis;
CREATE DATABASE bootvis;
USE bootvis
CREATE TABLE graph_edges
(
    pk int not null auto_increment,
    object_id int default null,
    tm timestamp not null default current_timestamp,
    from_id int,
    to_id int,
    primary key (pk)
);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
SELECT SLEEP(2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
SELECT SLEEP(1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
UPDATE graph_edges SET object_id = 1 WHERE object_id IS NULL;
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
SELECT SLEEP(2);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
UPDATE graph_edges SET object_id = 2 WHERE object_id IS NULL;
SELECT * FROM graph_edges;
mysql> DROP DATABASE IF EXISTS bootvis;
Query OK, 1 row affected (0.03 sec)

mysql> CREATE DATABASE bootvis;
Query OK, 1 row affected (0.00 sec)

mysql> USE bootvis
Database changed
mysql> CREATE TABLE graph_edges
    -> (
    ->     pk int not null auto_increment,
    ->     object_id int default null,
    ->     tm timestamp not null default current_timestamp,
    ->     from_id int,
    ->     to_id int,
    ->     primary key (pk)
    -> );
Query OK, 0 rows affected (0.12 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.05 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> SELECT SLEEP(2);
+----------+
| SLEEP(2) |
+----------+
|        0 |
+----------+
1 row in set (2.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT SLEEP(1);
+----------+
| SLEEP(1) |
+----------+
|        0 |
+----------+
1 row in set (1.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
Query OK, 2 rows affected (0.07 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
Query OK, 8 rows affected (0.08 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.13 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.10 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> UPDATE graph_edges SET object_id = 1 WHERE object_id IS NULL;
Query OK, 22 rows affected (0.06 sec)
Rows matched: 22  Changed: 22  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2);
Query OK, 2 rows affected (0.05 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1),(2,1);
Query OK, 8 rows affected (0.05 sec)
Records: 8  Duplicates: 0  Warnings: 0

mysql> SELECT SLEEP(2);
+----------+
| SLEEP(2) |
+----------+
|        0 |
+----------+
1 row in set (2.00 sec)

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.08 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.06 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (2,1),(2,1);
Query OK, 2 rows affected (0.05 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> INSERT INTO graph_edges (from_id,to_id) VALUES (1,2),(1,2),(1,2),(1,2);
Query OK, 4 rows affected (0.05 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> INSERT IN
SET @r=0;
SET @old_from=-1;
SET @old_to=-1;
SELECT from_id,to_id,
@inc:=((@old_from<>from_id)||(@old_to<>to_id)),
@old_from:=from_id,@old_to:=to_id,
@r:=(@r+@inc) as group_number
FROM graph_edges;
mysql> SET @r=0;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @old_from=-1;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @old_to=-1;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT from_id,to_id,
    -> @inc:=((@old_from<>from_id)||(@old_to<>to_id)),
    -> @old_from:=from_id,@old_to:=to_id,
    -> @r:=(@r+@inc) as group_number
    -> FROM graph_edges;
+---------+-------+------------------------------------------------+--------------------+----------------+--------------+
| from_id | to_id | @inc:=((@old_from<>from_id)||(@old_to<>to_id)) | @old_from:=from_id | @old_to:=to_id | group_number |
+---------+-------+------------------------------------------------+--------------------+----------------+--------------+
|       1 |     2 |                                              1 |                  1 |              2 |            1 |
|       1 |     2 |                                              0 |                  1 |              2 |            1 |
|       1 |     2 |                                              0 |                  1 |              2 |            1 |
|       1 |     2 |                                              0 |                  1 |              2 |            1 |
|       2 |     1 |                                              1 |                  2 |              1 |            2 |
|       2 |     1 |                                              0 |                  2 |              1 |            2 |
|       1 |     2 |                                              1 |                  1 |              2 |            3 |
|       1 |     2 |                                              0 |                  1 |              2 |            3 |
|       2 |     1 |                                              1 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       2 |     1 |                                              0 |                  2 |              1 |            4 |
|       1 |     2 |                                              1 |                  1 |              2 |            5 |
|       1 |     2 |                                              0 |                  1 |              2 |            5 |
|       1 |     2 |                                              0 |   
SET @r=0;
SET @old_from=-1;
SET @old_to=-1;
SELECT group_number,from_id,to_id FROM
(
    SELECT from_id,to_id,
    @inc:=((@old_from<>from_id)||(@old_to<>to_id)),
    @old_from:=from_id,@old_to:=to_id,
    @r:=(@r+@inc) as group_number
    FROM graph_edges
) g GROUP BY group_number;

Context

StackExchange Database Administrators Q#29016, answer score: 5

Revisions (0)

No revisions yet.