patternsqlMinor
SUM of ever increasing column
Viewed 0 times
evercolumnincreasingsum
Problem
I have a table of "transactions" where each transaction has an amount:
http://sqlfiddle.com/#!15/42849/1
Records in the table are never REMOVE'ed or UPDATE'ed. Only new transactions are added.
I wish to calculate the SUM of the amounts. The calculation doesn't have to be 100 percent up to date for every request.
On a dataset of about a million rows this takes about 400 ms on my database. This is way too slow for my application and I am trying to find the best solution for speeding this up.
What I have tried so far
Question
Does PostgreSQL provide a solution for speeding this type of query up?
Update 1
The SUM query is just a basic sum on a single column so I don't believe that this query in itself can get any faster. The solution is probably to do some kind of caching/precalculation or similar. Does PostgreSQL have any features in this regard?
Update 2
Table in question:
Query in question:
Update 3
I found that I actually need a "type" as well.
Updated table:
Updated query:
SQL Fiddle:
http://sqlfiddle.com/#!15/77e67/2
http://sqlfiddle.com/#!15/42849/1
Records in the table are never REMOVE'ed or UPDATE'ed. Only new transactions are added.
I wish to calculate the SUM of the amounts. The calculation doesn't have to be 100 percent up to date for every request.
On a dataset of about a million rows this takes about 400 ms on my database. This is way too slow for my application and I am trying to find the best solution for speeding this up.
What I have tried so far
- Materialized view: Adds complexity of having to have a cronjob running which updates the view every X seconds.
- Caching on the application server: Every X request will be slow when the cache needs an update.
- Storing results of queries on an old subset: Store the SUM of previous request and use these to calculate the correct total. Adds complexity.
Question
Does PostgreSQL provide a solution for speeding this type of query up?
Update 1
The SUM query is just a basic sum on a single column so I don't believe that this query in itself can get any faster. The solution is probably to do some kind of caching/precalculation or similar. Does PostgreSQL have any features in this regard?
Update 2
Table in question:
CREATE TABLE transactions
(
id bigserial NOT NULL,
amount bigint NOT NULL
);Query in question:
SELECT SUM(amount) FROM transactions;Update 3
I found that I actually need a "type" as well.
Updated table:
CREATE TABLE transactions
(
id bigserial NOT NULL,
amount bigint NOT NULL,
type int NOT NULL
);Updated query:
SELECT SUM(amount) FROM transactions GROUP BY type;SQL Fiddle:
http://sqlfiddle.com/#!15/77e67/2
Solution
Here's one idea that you can evaluate:
The current amount should be something like:
On a regular basis you can refresh last_transaction similar to:
The version of PostgreSQL in your fiddle does not support (perhaps no version does?)
Just an idea, that may or may not fit your needs.
Edit: added type
If a type is to be included (consider naming it transaction_type or something similar) we can extend last_transaction:
To get the current_amount we need to add type to the
To do a full refresh (according to @Andriy M suggestion) of last_transaction:
I have yet to examine @YperSillyCubeᵀᴹ suggestion.
I added about a million rows to the transaction table and what I believe would be relevant indexes, but the plan in sqlfiddle looks kind of disappointing.
CREATE TABLE last_transaction
( last_id bigserial NOT NULL
, cumulative_amount bigint NOT NULL
);
INSERT INTO last_transaction (last_id, cumulative_amount) VALUES (-1,0);The current amount should be something like:
SELECT coalesce(SUM(t.amount),0) + coalesce(lt.cumulative_amount,0)
FROM transactions t
RIGHT JOIN last_transaction lt
ON t.id > lt.last_id
GROUP BY lt.cumulative_amount;On a regular basis you can refresh last_transaction similar to:
update last_transaction
set last_id = (select max(id) from transactions)
, cumulative_amount = (select sum(amount) from transactions);The version of PostgreSQL in your fiddle does not support (perhaps no version does?)
set (last_id, cumulative_amount) = (select ...)Just an idea, that may or may not fit your needs.
Edit: added type
If a type is to be included (consider naming it transaction_type or something similar) we can extend last_transaction:
CREATE TABLE last_transaction
( type int not null
, last_id bigserial NOT NULL
, cumulative_amount bigint NOT NULL
, constraint pk_last_transaction primary key (type)
);
INSERT INTO last_transaction (type, last_id, cumulative_amount)
SELECT distinct type, -1, 0
FROM transactions;To get the current_amount we need to add type to the
GROUP BY clause as well as to the ON clause.SELECT lt.type
, coalesce(SUM(t.amount),0) + coalesce(lt.cumulative_amount,0)
FROM transactions t
RIGHT JOIN last_transaction lt
ON t.id > lt.last_id
AND t.type = lt.type
GROUP BY lt.type, lt.cumulative_amount;To do a full refresh (according to @Andriy M suggestion) of last_transaction:
UPDATE last_transaction AS lt
SET last_id = t.last_id
, cumulative_amount = t.cumulative_amount
FROM (
SELECT TYPE
, MAX(id)
, SUM(amount)
FROM transactions
GROUP BY TYPE
) AS t (type, last_id, cumulative_amount)
WHERE t.type = lt.type;I have yet to examine @YperSillyCubeᵀᴹ suggestion.
I added about a million rows to the transaction table and what I believe would be relevant indexes, but the plan in sqlfiddle looks kind of disappointing.
Code Snippets
CREATE TABLE last_transaction
( last_id bigserial NOT NULL
, cumulative_amount bigint NOT NULL
);
INSERT INTO last_transaction (last_id, cumulative_amount) VALUES (-1,0);SELECT coalesce(SUM(t.amount),0) + coalesce(lt.cumulative_amount,0)
FROM transactions t
RIGHT JOIN last_transaction lt
ON t.id > lt.last_id
GROUP BY lt.cumulative_amount;update last_transaction
set last_id = (select max(id) from transactions)
, cumulative_amount = (select sum(amount) from transactions);set (last_id, cumulative_amount) = (select ...)CREATE TABLE last_transaction
( type int not null
, last_id bigserial NOT NULL
, cumulative_amount bigint NOT NULL
, constraint pk_last_transaction primary key (type)
);
INSERT INTO last_transaction (type, last_id, cumulative_amount)
SELECT distinct type, -1, 0
FROM transactions;Context
StackExchange Database Administrators Q#124227, answer score: 2
Revisions (0)
No revisions yet.