HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

QUERY - pivot multiple columns, variable number of rows

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsnumbercolumnspivotquerymultiplevariable

Problem

I have a table that looks like this:

RECIPE     VERSION_ID     INGREDIENT    PERCENTAGE
4000       100            Ing_1          23,0
4000       100            Ing_100         0,1
4000       200            Ing_1          20,0
4000       200            Ing_100         0,7
4000       300            Ing_1          22,3
4000       300            Ing_100         0,9
4001       900            Ing_1           8,3
4001       900            Ing_100        72,4
4001       901            Ing_1           9,3
4001       901            Ing_100        70,5
5012       871            Ing_1          45,1
5012       871            Ing_100         0,9
5012       877            Ing_1          47,2
5012       877            Ing_100         0,8
5012       879            Ing_1          46,6
5012       879            Ing_100         0,9
5012       880            Ing_1          43,6
5012       880            Ing_100         1,2


There are 100 ingredients per recipe/version. I'd like to display the data from this table like this:

RECIPE     INGREDIENT_Vxxx     PERCENTAGE_Vxxx     INGREDIENT_Vyyy     INGREDIENT_Vyyy (ETC)
4000       Ing_1               23,0                Ing_1               20,0
4000       Ing_100             0,1                 Ing_100              0,7


Because in different versions of recipes, ingredients could be removed or added, I'd like to display both ingredient and percentage per version per recipe. There's also the difficulty that different recipes have different number of versions.

I'm not even sure if this is possible at all or where to begin. Maybe with the PIVOT function?

Could anyone please point me in the right direction?

Solution

The problem here seems to me to be largely more of a scoping issue - you are likely having difficulty solving this problem on account of the requirements not being defined well enough. With the description and sample data provided, there are at least three partial solutions, none of which may be applicable to your particular use cases. With the test data set up as follows,

IF NOT EXISTS ( SELECT  1
                FROM    sys.objects
                WHERE   name = 'Recipe'
                    AND type = 'U' )
BEGIN
    --DROP TABLE dbo.Recipe;
    CREATE TABLE dbo.Recipe
    (
        Recipe          INTEGER NOT NULL,
        VersionID       INTEGER NOT NULL,
        Ingredient      VARCHAR( 8 ) NOT NULL,
        Percentage      DECIMAL( 5, 2 )
    );

    INSERT INTO dbo.Recipe ( Recipe, VersionID, Ingredient, Percentage )
                SELECT  4000, 100, 'Ing_1', 23.0
    UNION ALL   SELECT  4000, 100, 'Ing_100', 0.1
    UNION ALL   SELECT  4000, 200, 'Ing_1', 20.0
    UNION ALL   SELECT  4000, 200, 'Ing_100', 0.7
    UNION ALL   SELECT  4000, 300, 'Ing_1', 22.3
    UNION ALL   SELECT  4000, 300, 'Ing_100', 0.9
    UNION ALL   SELECT  4001, 900, 'Ing_1', 8.3
    UNION ALL   SELECT  4001, 900, 'Ing_100', 72.4
    UNION ALL   SELECT  4001, 901, 'Ing_1', 9.3
    UNION ALL   SELECT  4001, 901, 'Ing_100', 70.5
    UNION ALL   SELECT  5012, 871, 'Ing_1', 45.1
    UNION ALL   SELECT  5012, 871, 'Ing_100', 0.9
    UNION ALL   SELECT  5012, 877, 'Ing_1', 47.2
    UNION ALL   SELECT  5012, 877, 'Ing_100', 0.8
    UNION ALL   SELECT  5012, 879, 'Ing_1', 46.6
    UNION ALL   SELECT  5012, 879, 'Ing_100', 0.9
    UNION ALL   SELECT  5012, 880, 'Ing_1', 43.6
    UNION ALL   SELECT  5012, 880, 'Ing_100', 1.2;

    ALTER TABLE dbo.Recipe
    ADD CONSTRAINT PK__Recipe
        PRIMARY KEY CLUSTERED ( Recipe, VersionID, Ingredient );

    CREATE NONCLUSTERED INDEX IX__Recipe__Recipe__VersionID
        ON  dbo.Recipe ( Recipe, VersionID )
    INCLUDE ( Percentage );
END;
GO


we can use our new table to explore some possible solutions. Expanding on the sample output, we add the next recipe into the result set to illustrate the difficulty with the question.

RECIPE --- INGREDIENT_V100 --- PERCENTAGE_V100 --- INGREDIENT_V200 --- INGREDIENT_V200 
4000       Ing_1               23,0                Ing_1               20,0
4000       Ing_100              0,1                Ing_100              0,7
4001       Ing_1                8,3                Ing_1                9,3
4001       Ing_100             72,4                Ing_100             70,5


Columns %_V100 and %_V200 make perfect sense in the case of the 4000 recipe, but quickly lose their meaning as additional recipes are added. The 4001 recipe would need new and separate columns to properly label the data by version, but since the version numbers differ for each and every recipe, that path leads us to a very sparse result set which would be downright annoying to use, or we must alias the columns, losing the version number data.

Solution 1:

Starting with what I feel is the absolute worst way of going about this, let's look at the sparse result set. For the sample data, we would be attempting to generate a query which would look like something along the following lines:

SELECT  p.Recipe,
        [Ingredient_v100] = CASE WHEN p.[100] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v100] = p.[100], 
        [Ingredient_v200] = CASE WHEN p.[200] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v200] = p.[200], 
        [Ingredient_v300] = CASE WHEN p.[300] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v300] = p.[300], 
        [Ingredient_v871] = CASE WHEN p.[871] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v871] = p.[871], 
        [Ingredient_v877] = CASE WHEN p.[877] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v877] = p.[877], 
        [Ingredient_v879] = CASE WHEN p.[879] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v879] = p.[879], 
        [Ingredient_v880] = CASE WHEN p.[880] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v880] = p.[880], 
        [Ingredient_v900] = CASE WHEN p.[900] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v900] = p.[900], 
        [Ingredient_v901] = CASE WHEN p.[901] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v901] = p.[901]
FROM (  SELECT  r.Recipe, 
                r.VersionID, 
                r.Ingredient,
                r.Percentage 
        FROM    dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
        FOR s.VersionID IN ( [100], [200], [300], [871], [877], [879], [880], [900], [901] ) ) p
ORDER BY p.Recipe;


On account of the variable number of versions, we can use some dynamic SQL to generate and execute the query.

```
DECLARE @Piv NVARCHAR( MAX ),
@Col NVARCHAR( MAX ),
@SQL NVARCHAR( MAX );

SELECT @Piv = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM ( SELECT N'[' + C

Code Snippets

IF NOT EXISTS ( SELECT  1
                FROM    sys.objects
                WHERE   name = 'Recipe'
                    AND type = 'U' )
BEGIN
    --DROP TABLE dbo.Recipe;
    CREATE TABLE dbo.Recipe
    (
        Recipe          INTEGER NOT NULL,
        VersionID       INTEGER NOT NULL,
        Ingredient      VARCHAR( 8 ) NOT NULL,
        Percentage      DECIMAL( 5, 2 )
    );

    INSERT INTO dbo.Recipe ( Recipe, VersionID, Ingredient, Percentage )
                SELECT  4000, 100, 'Ing_1', 23.0
    UNION ALL   SELECT  4000, 100, 'Ing_100', 0.1
    UNION ALL   SELECT  4000, 200, 'Ing_1', 20.0
    UNION ALL   SELECT  4000, 200, 'Ing_100', 0.7
    UNION ALL   SELECT  4000, 300, 'Ing_1', 22.3
    UNION ALL   SELECT  4000, 300, 'Ing_100', 0.9
    UNION ALL   SELECT  4001, 900, 'Ing_1', 8.3
    UNION ALL   SELECT  4001, 900, 'Ing_100', 72.4
    UNION ALL   SELECT  4001, 901, 'Ing_1', 9.3
    UNION ALL   SELECT  4001, 901, 'Ing_100', 70.5
    UNION ALL   SELECT  5012, 871, 'Ing_1', 45.1
    UNION ALL   SELECT  5012, 871, 'Ing_100', 0.9
    UNION ALL   SELECT  5012, 877, 'Ing_1', 47.2
    UNION ALL   SELECT  5012, 877, 'Ing_100', 0.8
    UNION ALL   SELECT  5012, 879, 'Ing_1', 46.6
    UNION ALL   SELECT  5012, 879, 'Ing_100', 0.9
    UNION ALL   SELECT  5012, 880, 'Ing_1', 43.6
    UNION ALL   SELECT  5012, 880, 'Ing_100', 1.2;

    ALTER TABLE dbo.Recipe
    ADD CONSTRAINT PK__Recipe
        PRIMARY KEY CLUSTERED ( Recipe, VersionID, Ingredient );

    CREATE NONCLUSTERED INDEX IX__Recipe__Recipe__VersionID
        ON  dbo.Recipe ( Recipe, VersionID )
    INCLUDE ( Percentage );
END;
GO
RECIPE --- INGREDIENT_V100 --- PERCENTAGE_V100 --- INGREDIENT_V200 --- INGREDIENT_V200 
4000       Ing_1               23,0                Ing_1               20,0
4000       Ing_100              0,1                Ing_100              0,7
4001       Ing_1                8,3                Ing_1                9,3
4001       Ing_100             72,4                Ing_100             70,5
SELECT  p.Recipe,
        [Ingredient_v100] = CASE WHEN p.[100] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v100] = p.[100], 
        [Ingredient_v200] = CASE WHEN p.[200] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v200] = p.[200], 
        [Ingredient_v300] = CASE WHEN p.[300] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v300] = p.[300], 
        [Ingredient_v871] = CASE WHEN p.[871] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v871] = p.[871], 
        [Ingredient_v877] = CASE WHEN p.[877] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v877] = p.[877], 
        [Ingredient_v879] = CASE WHEN p.[879] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v879] = p.[879], 
        [Ingredient_v880] = CASE WHEN p.[880] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v880] = p.[880], 
        [Ingredient_v900] = CASE WHEN p.[900] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v900] = p.[900], 
        [Ingredient_v901] = CASE WHEN p.[901] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v901] = p.[901]
FROM (  SELECT  r.Recipe, 
                r.VersionID, 
                r.Ingredient,
                r.Percentage 
        FROM    dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
        FOR s.VersionID IN ( [100], [200], [300], [871], [877], [879], [880], [900], [901] ) ) p
ORDER BY p.Recipe;
DECLARE @Piv            NVARCHAR( MAX ),
        @Col            NVARCHAR( MAX ),
        @SQL            NVARCHAR( MAX );

SELECT  @Piv = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM (  SELECT  N'[' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '], '
        FROM (  SELECT  DISTINCT r.VersionID 
                FROM    dbo.Recipe r ) a
        ORDER BY a.VersionID
        FOR XML PATH ( '' ) ) b ( Piv );

SELECT  @Col = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM (  SELECT  N'[Ingredient_v' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] = CASE'
                    + ' WHEN p.[' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] IS NULL THEN NULL'
                    + ' ELSE p.[Ingredient] END, ' 
                    + '[Percentage_v' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] = p.[' 
                    + CONVERT( VARCHAR( 8 ), a.VersionID ) + '], ' 
        FROM (  SELECT  DISTINCT r.VersionID 
                FROM    dbo.Recipe r ) a
        ORDER BY a.VersionID
        FOR XML PATH ( '' ) ) b ( Piv );

SET @SQL = N'
        SELECT  p.Recipe, ' + @Col + '
        FROM (  SELECT  r.Recipe, 
                        r.VersionID, 
                        r.Ingredient,
                        r.Percentage 
                FROM    dbo.Recipe r ) s
        PIVOT ( MAX( s.Percentage )
                FOR s.VersionID IN ( ' + @Piv + ' ) ) p
        ORDER BY p.Recipe;';
EXECUTE dbo.sp_executesql @statement = @SQL;
GO
SELECT  p.Recipe, 
        [Ingredient_vA] = p.[Ingredient], [Percentage_vA] = ISNULL( p.[Percentage_vA], 0 ),
        [Ingredient_vB] = p.[Ingredient], [Percentage_vB] = ISNULL( p.[Percentage_vB], 0 ),
        [Ingredient_vC] = p.[Ingredient], [Percentage_vC] = ISNULL( p.[Percentage_vC], 0 ),
        [Ingredient_vD] = p.[Ingredient], [Percentage_vD] = ISNULL( p.[Percentage_vD], 0 )
FROM (  SELECT  Lvl = 'Percentage_v' + CHAR( 64 + 
                    DENSE_RANK() OVER ( 
                        PARTITION BY r.Recipe
                        ORDER BY r.VersionID ) ), 
                r.Recipe, 
                r.Ingredient,
                r.Percentage 
        FROM    dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
        FOR s.Lvl IN ( [Percentage_vA], [Percentage_vB], [Percentage_vC], [Percentage_vD] ) ) p
ORDER BY p.Recipe;

Context

StackExchange Database Administrators Q#65786, answer score: 8

Revisions (0)

No revisions yet.