debugsqlModerate
SQL Row concatenation with XML PATH and STUFF giving aggregate sql error
Viewed 0 times
patherrorsqlwithxmlstuffandrowconcatenationgiving
Problem
I am trying to query two tables and get results like the following:
The tables are:
I saw a few questions answered saying to use "XML PATH" and "STUFF" to query the data to get the results I am looking for, but I think there is something missing. I have tried the below query and get the error message:
Column 'a.AccountId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I don't have it in the SELECT clause of either query, but I assume the error is because AccountId is not unique in TableA.
Here is the query I am currently trying to get working correctly.
Section Names
shoes AccountName1, AccountName2, AccountName3
books AccountName1The tables are:
CREATE TABLE dbo.TableA(ID INT, Section varchar(64), AccountId varchar(64));
INSERT dbo.TableA(ID, Section, AccountId) VALUES
(1 ,'shoes','A1'),
(2 ,'shoes','A2'),
(3 ,'shoes','A3'),
(4 ,'books','A1');
CREATE TABLE dbo.TableB(AccountId varchar(20), Name varchar(64));
INSERT dbo.TableB(AccountId, Name) VALUES
('A1','AccountName1'),
('A2','AccountName2'),
('A3','AccountNAme3');I saw a few questions answered saying to use "XML PATH" and "STUFF" to query the data to get the results I am looking for, but I think there is something missing. I have tried the below query and get the error message:
Column 'a.AccountId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I don't have it in the SELECT clause of either query, but I assume the error is because AccountId is not unique in TableA.
Here is the query I am currently trying to get working correctly.
SELECT section, names= STUFF((
SELECT ', ' + Name FROM TableB as b
WHERE AccountId = b.AccountId
FOR XML PATH('')), 1, 1, '')
FROM TableA AS a
GROUP BY a.sectionSolution
Sorry, I missed a step in the relationship. Try this version (though Martin's will work as well):
An approach that is at least as good, but sometimes better, is switching from
At a high level, the reason
In any case, I ran both of these variations in SentryOne Plan Explorer. The plans are different in a few minor, uninteresting ways, but the I/O involved with the underlying worktable is telling. Here is
And here is
When I made the tables larger (14,000+ rows mapping to 24 potential values), this difference is more pronounced.
In SQL Server 2017, you can use
The I/O here is almost nothing:
But, if you're not on SQL Server 2017 (or Azure SQL Database), and can't use
Other enhancements from these posts:
Also see:
SELECT DISTINCT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o;An approach that is at least as good, but sometimes better, is switching from
DISTINCT to GROUP BY:SELECT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o
GROUP BY o.section;At a high level, the reason
DISTINCT applies to the entire column list. Therefore for any duplicates it has to perform the aggregate work for every duplicate before applying DISTINCT. If you use GROUP BY then it can potentially remove duplicates before doing any of the aggregation work. This behavior can vary by plan depending on a variety of factors including indexes, plan strategy, etc. And a direct switch to GROUP BY may not be possible in all cases.In any case, I ran both of these variations in SentryOne Plan Explorer. The plans are different in a few minor, uninteresting ways, but the I/O involved with the underlying worktable is telling. Here is
DISTINCT:And here is
GROUP BY:When I made the tables larger (14,000+ rows mapping to 24 potential values), this difference is more pronounced.
DISTINCT:GROUP BY:In SQL Server 2017, you can use
STRING_AGG:SELECT a.section, STRING_AGG(b.Name, ', ')
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = a.Section
GROUP BY a.section;The I/O here is almost nothing:
But, if you're not on SQL Server 2017 (or Azure SQL Database), and can't use
STRING_AGG, I have to give credit where credit is due... Paul White's answer below has very little I/O and kicks the pants off of both of the FOR XML PATH solutions above.Other enhancements from these posts:
- Grouped Concatenation in SQL Server
- Grouped Concatenation : Ordering and Removing Duplicates
- Comparing string splitting / concatenation methods
Also see:
- Bad habits to kick : avoiding the schema prefix
Code Snippets
SELECT DISTINCT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o;SELECT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o
GROUP BY o.section;SELECT a.section, STRING_AGG(b.Name, ', ')
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = a.Section
GROUP BY a.section;Context
StackExchange Database Administrators Q#137045, answer score: 16
Revisions (0)
No revisions yet.