patternsqlMinor
QUERY - pivot multiple columns, variable number of rows
Viewed 0 times
rowsnumbercolumnspivotquerymultiplevariable
Problem
I have a table that looks like this:
There are 100 ingredients per recipe/version. I'd like to display the data from this table like this:
Because in different versions of recipes, ingredients could be removed or added, I'd like to display both ingredient and percentage per version per recipe. There's also the difficulty that different recipes have different number of versions.
I'm not even sure if this is possible at all or where to begin. Maybe with the
Could anyone please point me in the right direction?
RECIPE VERSION_ID INGREDIENT PERCENTAGE
4000 100 Ing_1 23,0
4000 100 Ing_100 0,1
4000 200 Ing_1 20,0
4000 200 Ing_100 0,7
4000 300 Ing_1 22,3
4000 300 Ing_100 0,9
4001 900 Ing_1 8,3
4001 900 Ing_100 72,4
4001 901 Ing_1 9,3
4001 901 Ing_100 70,5
5012 871 Ing_1 45,1
5012 871 Ing_100 0,9
5012 877 Ing_1 47,2
5012 877 Ing_100 0,8
5012 879 Ing_1 46,6
5012 879 Ing_100 0,9
5012 880 Ing_1 43,6
5012 880 Ing_100 1,2There are 100 ingredients per recipe/version. I'd like to display the data from this table like this:
RECIPE INGREDIENT_Vxxx PERCENTAGE_Vxxx INGREDIENT_Vyyy INGREDIENT_Vyyy (ETC)
4000 Ing_1 23,0 Ing_1 20,0
4000 Ing_100 0,1 Ing_100 0,7Because in different versions of recipes, ingredients could be removed or added, I'd like to display both ingredient and percentage per version per recipe. There's also the difficulty that different recipes have different number of versions.
I'm not even sure if this is possible at all or where to begin. Maybe with the
PIVOT function?Could anyone please point me in the right direction?
Solution
The problem here seems to me to be largely more of a scoping issue - you are likely having difficulty solving this problem on account of the requirements not being defined well enough. With the description and sample data provided, there are at least three partial solutions, none of which may be applicable to your particular use cases. With the test data set up as follows,
we can use our new table to explore some possible solutions. Expanding on the sample output, we add the next recipe into the result set to illustrate the difficulty with the question.
Columns
Solution 1:
Starting with what I feel is the absolute worst way of going about this, let's look at the sparse result set. For the sample data, we would be attempting to generate a query which would look like something along the following lines:
On account of the variable number of versions, we can use some dynamic SQL to generate and execute the query.
```
DECLARE @Piv NVARCHAR( MAX ),
@Col NVARCHAR( MAX ),
@SQL NVARCHAR( MAX );
SELECT @Piv = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM ( SELECT N'[' + C
IF NOT EXISTS ( SELECT 1
FROM sys.objects
WHERE name = 'Recipe'
AND type = 'U' )
BEGIN
--DROP TABLE dbo.Recipe;
CREATE TABLE dbo.Recipe
(
Recipe INTEGER NOT NULL,
VersionID INTEGER NOT NULL,
Ingredient VARCHAR( 8 ) NOT NULL,
Percentage DECIMAL( 5, 2 )
);
INSERT INTO dbo.Recipe ( Recipe, VersionID, Ingredient, Percentage )
SELECT 4000, 100, 'Ing_1', 23.0
UNION ALL SELECT 4000, 100, 'Ing_100', 0.1
UNION ALL SELECT 4000, 200, 'Ing_1', 20.0
UNION ALL SELECT 4000, 200, 'Ing_100', 0.7
UNION ALL SELECT 4000, 300, 'Ing_1', 22.3
UNION ALL SELECT 4000, 300, 'Ing_100', 0.9
UNION ALL SELECT 4001, 900, 'Ing_1', 8.3
UNION ALL SELECT 4001, 900, 'Ing_100', 72.4
UNION ALL SELECT 4001, 901, 'Ing_1', 9.3
UNION ALL SELECT 4001, 901, 'Ing_100', 70.5
UNION ALL SELECT 5012, 871, 'Ing_1', 45.1
UNION ALL SELECT 5012, 871, 'Ing_100', 0.9
UNION ALL SELECT 5012, 877, 'Ing_1', 47.2
UNION ALL SELECT 5012, 877, 'Ing_100', 0.8
UNION ALL SELECT 5012, 879, 'Ing_1', 46.6
UNION ALL SELECT 5012, 879, 'Ing_100', 0.9
UNION ALL SELECT 5012, 880, 'Ing_1', 43.6
UNION ALL SELECT 5012, 880, 'Ing_100', 1.2;
ALTER TABLE dbo.Recipe
ADD CONSTRAINT PK__Recipe
PRIMARY KEY CLUSTERED ( Recipe, VersionID, Ingredient );
CREATE NONCLUSTERED INDEX IX__Recipe__Recipe__VersionID
ON dbo.Recipe ( Recipe, VersionID )
INCLUDE ( Percentage );
END;
GOwe can use our new table to explore some possible solutions. Expanding on the sample output, we add the next recipe into the result set to illustrate the difficulty with the question.
RECIPE --- INGREDIENT_V100 --- PERCENTAGE_V100 --- INGREDIENT_V200 --- INGREDIENT_V200
4000 Ing_1 23,0 Ing_1 20,0
4000 Ing_100 0,1 Ing_100 0,7
4001 Ing_1 8,3 Ing_1 9,3
4001 Ing_100 72,4 Ing_100 70,5Columns
%_V100 and %_V200 make perfect sense in the case of the 4000 recipe, but quickly lose their meaning as additional recipes are added. The 4001 recipe would need new and separate columns to properly label the data by version, but since the version numbers differ for each and every recipe, that path leads us to a very sparse result set which would be downright annoying to use, or we must alias the columns, losing the version number data.Solution 1:
Starting with what I feel is the absolute worst way of going about this, let's look at the sparse result set. For the sample data, we would be attempting to generate a query which would look like something along the following lines:
SELECT p.Recipe,
[Ingredient_v100] = CASE WHEN p.[100] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v100] = p.[100],
[Ingredient_v200] = CASE WHEN p.[200] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v200] = p.[200],
[Ingredient_v300] = CASE WHEN p.[300] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v300] = p.[300],
[Ingredient_v871] = CASE WHEN p.[871] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v871] = p.[871],
[Ingredient_v877] = CASE WHEN p.[877] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v877] = p.[877],
[Ingredient_v879] = CASE WHEN p.[879] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v879] = p.[879],
[Ingredient_v880] = CASE WHEN p.[880] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v880] = p.[880],
[Ingredient_v900] = CASE WHEN p.[900] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v900] = p.[900],
[Ingredient_v901] = CASE WHEN p.[901] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v901] = p.[901]
FROM ( SELECT r.Recipe,
r.VersionID,
r.Ingredient,
r.Percentage
FROM dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
FOR s.VersionID IN ( [100], [200], [300], [871], [877], [879], [880], [900], [901] ) ) p
ORDER BY p.Recipe;On account of the variable number of versions, we can use some dynamic SQL to generate and execute the query.
```
DECLARE @Piv NVARCHAR( MAX ),
@Col NVARCHAR( MAX ),
@SQL NVARCHAR( MAX );
SELECT @Piv = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM ( SELECT N'[' + C
Code Snippets
IF NOT EXISTS ( SELECT 1
FROM sys.objects
WHERE name = 'Recipe'
AND type = 'U' )
BEGIN
--DROP TABLE dbo.Recipe;
CREATE TABLE dbo.Recipe
(
Recipe INTEGER NOT NULL,
VersionID INTEGER NOT NULL,
Ingredient VARCHAR( 8 ) NOT NULL,
Percentage DECIMAL( 5, 2 )
);
INSERT INTO dbo.Recipe ( Recipe, VersionID, Ingredient, Percentage )
SELECT 4000, 100, 'Ing_1', 23.0
UNION ALL SELECT 4000, 100, 'Ing_100', 0.1
UNION ALL SELECT 4000, 200, 'Ing_1', 20.0
UNION ALL SELECT 4000, 200, 'Ing_100', 0.7
UNION ALL SELECT 4000, 300, 'Ing_1', 22.3
UNION ALL SELECT 4000, 300, 'Ing_100', 0.9
UNION ALL SELECT 4001, 900, 'Ing_1', 8.3
UNION ALL SELECT 4001, 900, 'Ing_100', 72.4
UNION ALL SELECT 4001, 901, 'Ing_1', 9.3
UNION ALL SELECT 4001, 901, 'Ing_100', 70.5
UNION ALL SELECT 5012, 871, 'Ing_1', 45.1
UNION ALL SELECT 5012, 871, 'Ing_100', 0.9
UNION ALL SELECT 5012, 877, 'Ing_1', 47.2
UNION ALL SELECT 5012, 877, 'Ing_100', 0.8
UNION ALL SELECT 5012, 879, 'Ing_1', 46.6
UNION ALL SELECT 5012, 879, 'Ing_100', 0.9
UNION ALL SELECT 5012, 880, 'Ing_1', 43.6
UNION ALL SELECT 5012, 880, 'Ing_100', 1.2;
ALTER TABLE dbo.Recipe
ADD CONSTRAINT PK__Recipe
PRIMARY KEY CLUSTERED ( Recipe, VersionID, Ingredient );
CREATE NONCLUSTERED INDEX IX__Recipe__Recipe__VersionID
ON dbo.Recipe ( Recipe, VersionID )
INCLUDE ( Percentage );
END;
GORECIPE --- INGREDIENT_V100 --- PERCENTAGE_V100 --- INGREDIENT_V200 --- INGREDIENT_V200
4000 Ing_1 23,0 Ing_1 20,0
4000 Ing_100 0,1 Ing_100 0,7
4001 Ing_1 8,3 Ing_1 9,3
4001 Ing_100 72,4 Ing_100 70,5SELECT p.Recipe,
[Ingredient_v100] = CASE WHEN p.[100] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v100] = p.[100],
[Ingredient_v200] = CASE WHEN p.[200] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v200] = p.[200],
[Ingredient_v300] = CASE WHEN p.[300] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v300] = p.[300],
[Ingredient_v871] = CASE WHEN p.[871] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v871] = p.[871],
[Ingredient_v877] = CASE WHEN p.[877] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v877] = p.[877],
[Ingredient_v879] = CASE WHEN p.[879] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v879] = p.[879],
[Ingredient_v880] = CASE WHEN p.[880] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v880] = p.[880],
[Ingredient_v900] = CASE WHEN p.[900] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v900] = p.[900],
[Ingredient_v901] = CASE WHEN p.[901] IS NULL THEN NULL ELSE p.[Ingredient] END, [Percentage_v901] = p.[901]
FROM ( SELECT r.Recipe,
r.VersionID,
r.Ingredient,
r.Percentage
FROM dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
FOR s.VersionID IN ( [100], [200], [300], [871], [877], [879], [880], [900], [901] ) ) p
ORDER BY p.Recipe;DECLARE @Piv NVARCHAR( MAX ),
@Col NVARCHAR( MAX ),
@SQL NVARCHAR( MAX );
SELECT @Piv = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM ( SELECT N'[' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '], '
FROM ( SELECT DISTINCT r.VersionID
FROM dbo.Recipe r ) a
ORDER BY a.VersionID
FOR XML PATH ( '' ) ) b ( Piv );
SELECT @Col = LEFT( b.Piv, LEN( b.Piv ) - 1 )
FROM ( SELECT N'[Ingredient_v' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] = CASE'
+ ' WHEN p.[' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] IS NULL THEN NULL'
+ ' ELSE p.[Ingredient] END, '
+ '[Percentage_v' + CONVERT( VARCHAR( 8 ), a.VersionID ) + '] = p.['
+ CONVERT( VARCHAR( 8 ), a.VersionID ) + '], '
FROM ( SELECT DISTINCT r.VersionID
FROM dbo.Recipe r ) a
ORDER BY a.VersionID
FOR XML PATH ( '' ) ) b ( Piv );
SET @SQL = N'
SELECT p.Recipe, ' + @Col + '
FROM ( SELECT r.Recipe,
r.VersionID,
r.Ingredient,
r.Percentage
FROM dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
FOR s.VersionID IN ( ' + @Piv + ' ) ) p
ORDER BY p.Recipe;';
EXECUTE dbo.sp_executesql @statement = @SQL;
GOSELECT p.Recipe,
[Ingredient_vA] = p.[Ingredient], [Percentage_vA] = ISNULL( p.[Percentage_vA], 0 ),
[Ingredient_vB] = p.[Ingredient], [Percentage_vB] = ISNULL( p.[Percentage_vB], 0 ),
[Ingredient_vC] = p.[Ingredient], [Percentage_vC] = ISNULL( p.[Percentage_vC], 0 ),
[Ingredient_vD] = p.[Ingredient], [Percentage_vD] = ISNULL( p.[Percentage_vD], 0 )
FROM ( SELECT Lvl = 'Percentage_v' + CHAR( 64 +
DENSE_RANK() OVER (
PARTITION BY r.Recipe
ORDER BY r.VersionID ) ),
r.Recipe,
r.Ingredient,
r.Percentage
FROM dbo.Recipe r ) s
PIVOT ( MAX( s.Percentage )
FOR s.Lvl IN ( [Percentage_vA], [Percentage_vB], [Percentage_vC], [Percentage_vD] ) ) p
ORDER BY p.Recipe;Context
StackExchange Database Administrators Q#65786, answer score: 8
Revisions (0)
No revisions yet.