HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to Pivot in PostgreSQL

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlpivothow

Problem

Having the following data in a table:

ID     Category   Value
1234   Cat01      V001
1234   Cat02      V002
1234   Cat03      V003
1234   Cat03      V004
1234   Cat03      V005


I want to have the following output:

ID     Cat01  Cat02   Cat03
1234   V001   V002    V003
1234   V001   V002    V004
1234   V001   V002    V005


The output I want to achieve is a kind of pivot table where I have all the values vertically in a table and I want to have those values, horizontally, having the category as a column. But there are some categories that have multiples values, in that case, I need to repeat the values of all other categories and create a row per each repeated value

How can it be done in PostgreSQL?

Solution

This is a tricky one. crosstab() expects one (or no) value per category for each row_name.

We can work around this restriction like this:

SELECT id
     , COALESCE(cat01, max(cat01) OVER w)
     , COALESCE(cat02, max(cat02) OVER w)
     , COALESCE(cat03, max(cat03) OVER w)
FROM   crosstab(
   'SELECT id::text || row_number() OVER (PARTITION BY id, category ORDER BY value) * -1 AS ext_id
         , id, category, value
    FROM   tbl
    ORDER  BY ext_id, category, value'

   ,$VALUES ('Cat01'::text), ('Cat02'), ('Cat03')$
   ) AS ct (xid text, id int, cat01 text, cat02 text, cat03 text)
WINDOW w AS (PARTITION BY id);


Returns your desired result.
How?

-
Add an extended id: ext_id from the existing id and a row number for each value of the category for the same id. This way we ensure as many rows per id in as there are values for the most common category. We get a derived table like this to build our crosstab() on:
ext_id | id | category | value
---------+------+----------+-------
'1234-1' | 1234 | 'Cat01' | 'V001'
'1234-1' | 1234 | 'Cat02' | 'V002'
'1234-1' | 1234 | 'Cat03' | 'V003'
'1234-2' | 1234 | 'Cat03' | 'V004'
'1234-3' | 1234 | 'Cat03' | 'V005'


-
Now we can feed it to crosstab() using the safe 2-parameter form for missing attributes. Read the basics first if you are not familiar with this:

-
PostgreSQL Crosstab Query

The original id is carried over as "extra column". See:

-
Pivot on Multiple Columns using Tablefunc

-
Your question leaves room for interpretation. My solution pairs the lowest values per category first and keeps filling the following rows until there are no values left. (We could combine multiple values per category any other way, it has not been defined.) If a category is short of values for a given id, the rest is filled in with NULL values.

-
In the final step I replace those NULL values with the maximum value of each category per id:

COALESCE(cat01, max(cat01) OVER (PARTITION BY id, category))


which is effectively the same as:

max(cat01) OVER (PARTITION BY id, category)


I am hoping to make it slightly faster if we only default to the window function if the value is NULL.

Code Snippets

SELECT id
     , COALESCE(cat01, max(cat01) OVER w)
     , COALESCE(cat02, max(cat02) OVER w)
     , COALESCE(cat03, max(cat03) OVER w)
FROM   crosstab(
   'SELECT id::text || row_number() OVER (PARTITION BY id, category ORDER BY value) * -1 AS ext_id
         , id, category, value
    FROM   tbl
    ORDER  BY ext_id, category, value'

   ,$$VALUES ('Cat01'::text), ('Cat02'), ('Cat03')$$
   ) AS ct (xid text, id int, cat01 text, cat02 text, cat03 text)
WINDOW w AS (PARTITION BY id);
COALESCE(cat01, max(cat01) OVER (PARTITION BY id, category))
max(cat01) OVER (PARTITION BY id, category)

Context

StackExchange Database Administrators Q#135079, answer score: 6

Revisions (0)

No revisions yet.