Pivot with 2+ columns (using CROSSTAB?)
Problem
I have a table deflator that is defined as:

Table "deflator"
Column       | Type              | Modifiers
-------------+-------------------+-----------
country_code | smallint          | not null
country_name | character varying | not null
year         | smallint          | not null
deflator     | numeric           |
source       | character varying |

Sample output from this table looks like:
country_code | country_name | year | deflator | source
-------------+--------------+------+----------+----------
           1 | country_1    | 2016 |       12 | source_1
           1 | country_1    | 2015 |       11 | source_2
           1 | country_1    | 2014 |       10 | source_2
           2 | country_2    | 2016 |       15 | source_1
           2 | country_2    | 2015 |       14 | source_1
           2 | country_2    | 2014 |       13 | source_2
           3 | country_3    | 2016 |       18 | source_1
           3 | country_3    | 2015 |       17 | source_2
           3 | country_3    | 2014 |       16 | source_3
(9 rows)

I use the following query to pivot the table if I exclude the column source:

SELECT
  *
FROM CROSSTAB (
  'SELECT
     country_code
   , country_name
   , year
   , deflator
   FROM dimension.master_oecd_deflator
   ORDER BY 1;'
  , $$ VALUES ('2014'::TEXT), ('2015'::TEXT), ('2016'::TEXT) $$
) AS "ct" (
    "country_code" SMALLINT
  , "country_name" TEXT
  , "2014" NUMERIC
  , "2015" NUMERIC
  , "2016" NUMERIC
);

The above query gives me:
country_code | country_name | 2016 | 2015 | 2014
-------------+--------------+------+------+------
           1 | country_1    |   12 |   11 |   10
           2 | country_2    |   15 |   14 |   13
           3 | country_3    |   18 |   17 |   16

But because the source of the deflator varies from year to year for each country, I want to include the source column in the pivot for my desired output.

Solution
Saddam has a smart solution, but it carries some weaknesses. Imagine a source named 'Fresno, CA' (with comma in the string).
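
To make the weakness concrete, here is a minimal sketch. It assumes that deflator and source were packed into one text value with a comma as separator and taken apart again with split_part(); the separator and the literal values are assumptions for illustration, not taken from the other answer.

```sql
-- Hypothetical packed values: '<deflator>,<source>' as a single text column.
SELECT packed
     , split_part(packed, ',', 1) AS deflator  -- comes back as text, not numeric
     , split_part(packed, ',', 2) AS source
FROM  (VALUES ('12,source_1')
            , ('12,Fresno, CA')) AS t(packed);

-- First row:  deflator = '12', source = 'source_1'
-- Second row: deflator = '12', source = 'Fresno'   (the ' CA' part is cut off)
```

Picking a rarer separator only makes the collision less likely, and the numeric value still has to be cast back from text.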
split_part() would be fooled by the separator character in the string ...

To avoid such corner case problems and preserve original data types, use a (well-defined!) row type instead. You can create a composite type permanently with CREATE TYPE or register a temporary one with CREATE TEMP TABLE:

CREATE TEMP TABLE defso (def numeric, so varchar);  -- once per session!

SELECT country_code
     , country_name
     , (d14).def AS deflator_2014  -- note the parentheses!
     , (d14).so  AS source_2014
     , (d15).def AS deflator_2015
     , (d15).so  AS source_2015
     , (d16).def AS deflator_2016
     , (d16).so  AS source_2016
FROM   crosstab (
     'SELECT country_code, country_name, year, (deflator, source)::defso
      FROM   deflator
      ORDER  BY 1'
   , 'SELECT generate_series(2014, 2016)::int2'
   ) AS ct (country_code int2
          , country_name text
          , d14 defso
          , d15 defso
          , d16 defso
     );

I also removed the unnecessary CTE and simplified a bit.
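
Two practical notes on the crosstab query, as a minimal sketch: crosstab() is provided by the additional module tablefunc, which has to be installed once per database, and the permanent alternative to the temp table is a CREATE TYPE statement. Reusing the name defso for the permanent type is an assumption made here so that the query works unchanged; use it instead of the temp table, not in addition to it.

```sql
-- crosstab() lives in the additional module tablefunc; install once per database.
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Permanent alternative to CREATE TEMP TABLE defso (...):
-- a composite type with the same field names and types.
CREATE TYPE defso AS (def numeric, so varchar);

-- Quick check of field access on a composite value.
-- The parentheses around v are required, and the comma in the string survives.
SELECT (v).def, (v).so
FROM  (SELECT ROW(12, 'Fresno, CA')::defso AS v) sub;
```

A permanent type persists across sessions, so the "once per session" step goes away, at the cost of one more object to maintain in the schema.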
While dealing with only a handful of years, you can do without crosstab() and use self-joins:

SELECT country_code, country_name
     , d14.deflator AS deflator_2014
     , d14.source   AS source_2014
     , d15.deflator AS deflator_2015
     , d15.source   AS source_2015
     , d16.deflator AS deflator_2016
     , d16.source   AS source_2016
FROM      (SELECT * FROM deflator WHERE year = int2 '2014') d14
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2015') d15 USING (country_code, country_name)
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2016') d16 USING (country_code, country_name)
ORDER  BY country_code;

This uses FULL [OUTER] JOIN since we can't assume a row for every combination of (country_code, year). This way we get the same result as with the crosstab query above.

Including country_name in the join condition seems redundant, but if we don't, we have to use COALESCE(d14.country_name, d15.country_name, d16.country_name) AS country_name to defend against missing rows. This functionally dependent value shouldn't be in the table to begin with. It should be in a country table in a properly normalized schema.
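
For comparison, the variant that joins on country_code alone could be sketched as follows. It is the same self-join with only the join columns and the country_name expression changed; treat it as an illustration of the COALESCE point, not as a tested replacement.

```sql
SELECT country_code
       -- country_name is no longer a join column, so it must be qualified;
       -- COALESCE picks the first non-null name among the three years.
     , COALESCE(d14.country_name, d15.country_name, d16.country_name) AS country_name
     , d14.deflator AS deflator_2014
     , d14.source   AS source_2014
     , d15.deflator AS deflator_2015
     , d15.source   AS source_2015
     , d16.deflator AS deflator_2016
     , d16.source   AS source_2016
FROM      (SELECT * FROM deflator WHERE year = int2 '2014') d14
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2015') d15 USING (country_code)
FULL JOIN (SELECT * FROM deflator WHERE year = int2 '2016') d16 USING (country_code)
ORDER  BY country_code;
```

The merged country_code column produced by USING already behaves like a COALESCE over the joined tables, which is why only country_name needs the explicit expression.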
Context
StackExchange Database Administrators Q#158181, answer score: 5