snippetsqlMinor
How to transpose values from one column into columns with a value from a different column?
Viewed 0 times
columnscolumntransposeintowithvaluedifferentonehowvalues
Problem
I have a database with the following structure:
Date
role
type
duration
2022-04-16
Nurse
Food preparation
45
2022-04-17
Nurse
Cleaning
30
2022-04-17
Volunteer
Cleaning
20
2022-04-17
Nurse
Food preparation
60
Note: I don't know the values in the "type" column in advance, since they are defined by the user. Also, There can be multiple rows with overlapping date, role, and type.
I am using a charting library that would like for the data to be grouped as follows:
role
Food preparation
Cleaning
Nurse
105
30
Volunteer
Null
20
So far, I am able to group the data using the following query
role
type
total_mintes
Nurse
Cleaning
45
Nurse
Food preparation
20
Volunteer
Cleaning
15
Volunteer
Food preparation
43
How can I "pivot"/"transpose" the data so that each row represents a role with one column containing the sum of minutes for each type of work?
In effect, I would like to transpose the data similar to the Pandas DataFrame.pivot_table function, but using only SQL.
Date
role
type
duration
2022-04-16
Nurse
Food preparation
45
2022-04-17
Nurse
Cleaning
30
2022-04-17
Volunteer
Cleaning
20
2022-04-17
Nurse
Food preparation
60
Note: I don't know the values in the "type" column in advance, since they are defined by the user. Also, There can be multiple rows with overlapping date, role, and type.
I am using a charting library that would like for the data to be grouped as follows:
role
Food preparation
Cleaning
Nurse
105
30
Volunteer
Null
20
So far, I am able to group the data using the following query
select
role,
type,
sum(duration) as total_minutes
from work
group by role, type;
role
type
total_mintes
Nurse
Cleaning
45
Nurse
Food preparation
20
Volunteer
Cleaning
15
Volunteer
Food preparation
43
How can I "pivot"/"transpose" the data so that each row represents a role with one column containing the sum of minutes for each type of work?
In effect, I would like to transpose the data similar to the Pandas DataFrame.pivot_table function, but using only SQL.
Solution
First of all you will need to install the tablefunc extension using the
Even after reading this answer, it is recommended that you read the official documentation of PostgreSQL on crosstab here
As for how to do this:
Pay attention to the explicit
You will have to specify each possible output of the column
A more dynamic version of the above (although not perfect by any means):
This function would return the query you need to execute to get your desired result. It will dynamically build the list of possible columns you need for the output. This function can definitely be made to be more general purpose like it is done here but it's not a small amount of work to do that as PostgreSQL cannot return a set it does not know its definition beforehand.
There is the other option of this function instead of returning a query string, it can instead return an array of json objects each representing a row, and you would split this json into normal rows and column on application side. If such a solution is acceptable then this works fine:
The result of this function would be something similar to the following
create extension tablefunc; command, otherwise the pivot function crosstab will not work.Even after reading this answer, it is recommended that you read the official documentation of PostgreSQL on crosstab here
As for how to do this:
select *
from crosstab(
'select
role,
type,
sum(duration) as total_minutes
from work
group by role, type
order by type',
'select distinct type from work order by type'
) as ct(
role text,
"Cleaning" text,
"Food preparation" text
);Pay attention to the explicit
order by clause in both queries, this is a must, otherwise it may map values incorrectly as SQL does not guarantee the order of data without it.You will have to specify each possible output of the column
type in the alias.A more dynamic version of the above (although not perfect by any means):
create or replace function get_dynamic_transpose()
returns text
language plpgsql
as
$
declare
v_output_columns text;
begin
select array_to_string(array_agg(distinct quote_ident(type) || ' ' || pg_typeof(type) || E' \n'),',','null')
into v_output_columns
from testing;
return format(
'select *
from crosstab(
''select
role,
type,
sum(duration) as total_minutes
from testing
group by role, type
order by type'',
''select distinct type from testing order by type''
) as ct(
role text,
%s
);', v_output_columns
);
end;
$;This function would return the query you need to execute to get your desired result. It will dynamically build the list of possible columns you need for the output. This function can definitely be made to be more general purpose like it is done here but it's not a small amount of work to do that as PostgreSQL cannot return a set it does not know its definition beforehand.
There is the other option of this function instead of returning a query string, it can instead return an array of json objects each representing a row, and you would split this json into normal rows and column on application side. If such a solution is acceptable then this works fine:
create or replace function get_dynamic_transpose_jsonb()
returns jsonb
language plpgsql
as
$
declare
v_output_columns text;
v_query text;
v_result jsonb;
begin
select array_to_string(array_agg(distinct quote_ident(type) || ' ' || pg_typeof(type) || E' \n'),',','null')
into v_output_columns
from testing;
v_query = format(
'select jsonb_agg(ct)
from crosstab(
''select
role,
type,
sum(duration) as total_minutes
from testing
group by role, type
order by type'',
''select distinct type from testing order by type''
) as ct(
role text,
%s
);', v_output_columns
);
execute v_query into v_result;
return v_result;
end;
$;The result of this function would be something similar to the following
[{"role": "Nurse", "Cleaning": "30", "Food preparation": null}, {"role": "Volunteer", "Cleaning": null, "Food preparation": "55"}]Code Snippets
select *
from crosstab(
'select
role,
type,
sum(duration) as total_minutes
from work
group by role, type
order by type',
'select distinct type from work order by type'
) as ct(
role text,
"Cleaning" text,
"Food preparation" text
);create or replace function get_dynamic_transpose()
returns text
language plpgsql
as
$$
declare
v_output_columns text;
begin
select array_to_string(array_agg(distinct quote_ident(type) || ' ' || pg_typeof(type) || E' \n'),',','null')
into v_output_columns
from testing;
return format(
'select *
from crosstab(
''select
role,
type,
sum(duration) as total_minutes
from testing
group by role, type
order by type'',
''select distinct type from testing order by type''
) as ct(
role text,
%s
);', v_output_columns
);
end;
$$;create or replace function get_dynamic_transpose_jsonb()
returns jsonb
language plpgsql
as
$$
declare
v_output_columns text;
v_query text;
v_result jsonb;
begin
select array_to_string(array_agg(distinct quote_ident(type) || ' ' || pg_typeof(type) || E' \n'),',','null')
into v_output_columns
from testing;
v_query = format(
'select jsonb_agg(ct)
from crosstab(
''select
role,
type,
sum(duration) as total_minutes
from testing
group by role, type
order by type'',
''select distinct type from testing order by type''
) as ct(
role text,
%s
);', v_output_columns
);
execute v_query into v_result;
return v_result;
end;
$$;[{"role": "Nurse", "Cleaning": "30", "Food preparation": null}, {"role": "Volunteer", "Cleaning": null, "Food preparation": "55"}]Context
StackExchange Database Administrators Q#311053, answer score: 4
Revisions (0)
No revisions yet.