HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

Filter rows, but only select most recent row for each Id

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsrecenteachbutfilterforselectrowonlymost

Problem

I have some code that is processing a lot of tasks. These tasks are either connected to a user or an account. Each Task go through many statuses -> Queued, Started, Error, Postponed, Queued, Started, Finished ...

Since there will be processed a lot of these Tasks very fast, the goal was to only do writes. As you can see from the comments, the other goal of saving these statuses is to be able to show the Task status in a user interface.

This was solved like so:

-- Used to store processing data for a Task
drop table if exists task;
create table task (
    id uuid,
    data text,
    type text,
    primary key (id)
);

-- Used to find all statuses for a specific task
-- Use case: View one Task i Admin interface
drop table if exists task_status;
create table task_status (
    task_id uuid,
    time_id timeuuid,
    status_enum text,
    status_text text,
    primary key ((task_id), time_id)
) with clustering order by (time_id desc);

-- Used to find Tasks for an Account
-- Use case: Show Account Details > Tasks
drop table if exists account_tasks;
create table account_tasks (
    account_id bigint,
    task_id uuid,
    time_id timeuuid,
    primary key ((account_id), time_id)
) with clustering order by (time_id desc);

-- Used to find Tasks for a User
-- Use case: Show User Details > Tasks
drop table if exists user_tasks;
create table user_tasks (
    user_id bigint,
    task_id uuid,
    time_id timeuuid,
    primary key ((user_id), time_id)
) with clustering order by (time_id desc);


When a Task is created, a row will be written to "task" and depending on which task is it, also to "user_tasks" or "account_tasks".
During processing of the Task multiple rows will be written to "task_status" with different statuses and a user friendly text describing what happened in the step.

I also need to be able to find all the Tasks that currently have a specific status, but also using a filter on type.

I have tried something like this:

```
-- Used to find tasks with a

Solution

One thing you can do is create another table which will store the current status of every task. So whenever you are updating your task_overview you will also update this new table keeping only the latest status. This will make table task_overview more or less like a log.

Context

StackExchange Database Administrators Q#109169, answer score: 2

Revisions (0)

No revisions yet.