HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Retrieving the most recent communication from a user

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
therecentusercommunicationretrievingfrommost

Problem

Could someone review an accepted answered I gave on Stack Overflow?

The use-case is as follows:


Given a messaging system where a user
can receive a message from a single
user and send messages to one or more
users, return the most recent
communication (sent or received)
between a passed userId and the
individual(s) that user communicated
with.

For the example, I have three tables:

Users

id user_name
1 Walker
2 John
3 Kate


Messages

id senderid body time
1 1 ignored 1 2010-04-01 00:00:00.000
2 1 ignored 2 2010-04-02 00:00:00.000
3 3 ignored 3 2010-04-03 00:00:00.000
4 1 msg A to john and kate 2010-04-10 00:00:00.000
5 3 msg b from kate to walker and john 2010-04-11 00:00:00.000


messages_recipients

id messageid userid
1 1 2
2 1 3
3 2 2
4 3 1
5 4 2
6 4 3
7 5 1
8 5 2


The data is tailored in such a way that I want a list of communications between user Walker and the people Walker has spoken with.

You can see a list of these messages by running the following SQL statement:

SELECT
u2.user_name AS Sender,
u1.user_name AS Receiver,
m.body,
m.time
FROM
messages m
JOIN
messages_recipients mr ON m.id = mr.messageid
JOIN
users u1 ON mr.userid = u1.id
JOIN
users u2 ON m.senderid = u2.id
ORDER BY
time DESC


Now that we have the test scenario, the part I want reviewed: returning the most recently communicated message between Walker, John, and Kate.

`BEGIN
DECLARE @UserId INT = 1

--A. Main Query
SELECT
CASE
WHEN mtemp.senderid = 1 --@UserId
THEN
CONCAT('Message To: ', receivers.user_name)
ELSE

Solution

My solution has a similar complexity to yours (14 steps in EXPLAIN), assuming MySQL's query optimizer is smart enough. However, in my opinion, this formulation will be much easier to understand.

SELECT  IF(recipientid,
           CONCAT('Message To: ', recipient.user_name),
           CONCAT('Message From: ', sender.user_name)) AS MessageType,
        body,
        time
    FROM
        ( -- Join messages with recipients, relabeling userids in terms of interlocutor and self
            SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
                FROM messages_recipients
                    INNER JOIN messages
                        ON messages.id = messageid
            UNION
            SELECT messages.id, time, body, senderid, NULL, senderid, userid
                FROM messages_recipients
                    INNER JOIN messages
                        ON messages.id = messageid
        ) AS thread_latest
        LEFT OUTER JOIN users AS recipient
            ON recipient.id = recipientid
        LEFT OUTER JOIN users AS sender
            ON sender.id = senderid
    WHERE
        -- Discard all but the latest message in each thread
        NOT EXISTS (
            SELECT messageid
                FROM
                (
                    SELECT messageid, time, userid AS interlocutor, senderid AS self
                        FROM messages_recipients
                            INNER JOIN messages
                                ON messages.id = messageid
                    UNION
                    SELECT messages.id, time, senderid, userid
                        FROM messages_recipients
                            INNER JOIN messages
                                ON messages.id = messageid
                ) AS thread_later
                WHERE
                    thread_later.self = thread_latest.self AND
                    thread_later.interlocutor = thread_latest.interlocutor AND
                    thread_later.time > thread_latest.time
            ) AND
        self = 1 --@UserId
    ORDER BY time DESC;


The main insight is that once you relabel senders and recipients in terms of interlocutor and self, it's just a simple matter of filtering out the results. Retain only those messages where self is the user in question. Then, every row that has the same interlocutor conceptually constitutes a thread.

Notice that there is a subquery that appears twice. We can make it clearer by creating a view.

CREATE VIEW threads AS
    -- Messages I sent
    SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid
    UNION
    -- Messages I received
    SELECT messages.id, time, body, senderid, NULL, senderid, userid
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid;

SELECT  IF(recipientid,
           CONCAT('Message To: ', recipient.user_name),
           CONCAT('Message From: ', sender.user_name)) AS MessageType,
        body,
        time
    FROM
        threads AS thread_latest
        LEFT OUTER JOIN users AS recipient
            ON recipient.id = recipientid
        LEFT OUTER JOIN users AS sender
            ON sender.id = senderid
    WHERE
        NOT EXISTS (
            SELECT messageid
                FROM threads AS thread_later
                WHERE
                    thread_later.self = thread_latest.self AND
                    thread_later.interlocutor = thread_latest.interlocutor AND
                    thread_later.time > thread_latest.time
            ) AND
        self = 1 --@UserId
    ORDER BY time DESC;


I'll take this opportunity to point out that this query is where PostgreSQL really shines. Two features in PostgreSQL (since version 8.4) make it easy. The WITH clause lets you define a helper view in the query itself. More importantly,
window functions let you partition the threads by interlocutor, which is precisely the tricky part about this problem.

```
WITH threads(messageid, time, body, senderid, recipientid, interlocutor, self) AS (
-- Messages I sent
SELECT messageid, time, body, NULL, userid, userid, senderid
FROM messages_recipients
INNER JOIN messages
ON messages.id = messageid
UNION
-- Messages I received
SELECT messages.id, time, body, senderid, NULL, senderid, userid
FROM messages_recipients
INNER JOIN messages
ON messages.id = messageid
)
SELECT CASE WHEN recipientid IS NOT NULL
THEN 'Message To: ' || recipient.user_name
ELSE 'Message From: ' || sender.user_name
END AS MessageType,
body,
time
FROM (
SELECT *,
RANK() OVER (PARTITION BY interlocutor ORDER BY time DESC) AS thread_pos

Code Snippets

SELECT  IF(recipientid,
           CONCAT('Message To: ', recipient.user_name),
           CONCAT('Message From: ', sender.user_name)) AS MessageType,
        body,
        time
    FROM
        ( -- Join messages with recipients, relabeling userids in terms of interlocutor and self
            SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
                FROM messages_recipients
                    INNER JOIN messages
                        ON messages.id = messageid
            UNION
            SELECT messages.id, time, body, senderid, NULL, senderid, userid
                FROM messages_recipients
                    INNER JOIN messages
                        ON messages.id = messageid
        ) AS thread_latest
        LEFT OUTER JOIN users AS recipient
            ON recipient.id = recipientid
        LEFT OUTER JOIN users AS sender
            ON sender.id = senderid
    WHERE
        -- Discard all but the latest message in each thread
        NOT EXISTS (
            SELECT messageid
                FROM
                (
                    SELECT messageid, time, userid AS interlocutor, senderid AS self
                        FROM messages_recipients
                            INNER JOIN messages
                                ON messages.id = messageid
                    UNION
                    SELECT messages.id, time, senderid, userid
                        FROM messages_recipients
                            INNER JOIN messages
                                ON messages.id = messageid
                ) AS thread_later
                WHERE
                    thread_later.self = thread_latest.self AND
                    thread_later.interlocutor = thread_latest.interlocutor AND
                    thread_later.time > thread_latest.time
            ) AND
        self = 1 --@UserId
    ORDER BY time DESC;
CREATE VIEW threads AS
    -- Messages I sent
    SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid
    UNION
    -- Messages I received
    SELECT messages.id, time, body, senderid, NULL, senderid, userid
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid;

SELECT  IF(recipientid,
           CONCAT('Message To: ', recipient.user_name),
           CONCAT('Message From: ', sender.user_name)) AS MessageType,
        body,
        time
    FROM
        threads AS thread_latest
        LEFT OUTER JOIN users AS recipient
            ON recipient.id = recipientid
        LEFT OUTER JOIN users AS sender
            ON sender.id = senderid
    WHERE
        NOT EXISTS (
            SELECT messageid
                FROM threads AS thread_later
                WHERE
                    thread_later.self = thread_latest.self AND
                    thread_later.interlocutor = thread_latest.interlocutor AND
                    thread_later.time > thread_latest.time
            ) AND
        self = 1 --@UserId
    ORDER BY time DESC;
WITH threads(messageid, time, body, senderid, recipientid, interlocutor, self) AS (
    -- Messages I sent
    SELECT messageid, time, body, NULL, userid, userid, senderid
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid
    UNION
    -- Messages I received
    SELECT messages.id, time, body, senderid, NULL, senderid, userid
        FROM messages_recipients
            INNER JOIN messages
                ON messages.id = messageid
)
SELECT  CASE WHEN recipientid IS NOT NULL
           THEN 'Message To: ' || recipient.user_name
           ELSE 'Message From: ' || sender.user_name
        END AS MessageType,
        body,
        time
   FROM (
        SELECT  *,
                RANK() OVER (PARTITION BY interlocutor ORDER BY time DESC) AS thread_pos
            FROM threads
            WHERE self = 1 --@UserId
        ) AS my_threads
        LEFT OUTER JOIN users AS recipient
            ON recipient.id = recipientid
        LEFT OUTER JOIN users AS sender
            ON sender.id = senderid
    WHERE thread_pos = 1 -- Only the latest message per thread
    ORDER BY time DESC;

Context

StackExchange Code Review Q#1981, answer score: 6

Revisions (0)

No revisions yet.