patternsqlMinor

Implementing "ANY" aggregate operator in T-SQL

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

operatorsqlanyimplementingaggregate

Problem

I have a database that contains department and employee tables, and I need to return the number of employees for each department. I write the following query:

SELECT d.ID, d.Name, COUNT(*) EmployeeCount
  FROM dbo.Departments d
       INNER JOIN dbo.Personnel p 
       ON p.DepartmentID = d.ID
GROUP BY d.ID, d.Name;

The problem is, I'm worried that the GROUP BY operator is going to be slow when working on a VARCHAR column. I'd like to be able to write

SELECT d.ID, ANY(d.Name), COUNT(*)  EmployeeCount
  FROM dbo.Departments d
       INNER JOIN dbo.Personnel p 
       ON p.DepartmentID = d.ID
GROUP BY d.ID;

I could probably use the MIN/MAX aggregate functions, but they would likely make the query even slower. I know that, in MySQL, you can select a column that doesn't appear in the GROUP BY clause, and the DB engine will return any row value at random for that column. Is something like that doable in T-SQL?

Solution

I think you are trying to optimize a performance problem that doesn't exist. I don't know why you think a GROUP BY is going to be problematic here. Conceptually there isn't really much difference between what you wrote and the following:

SELECT d.Id, d.Name, s.c
  FROM dbo.Departments AS d
  INNER JOIN 
  (
    SELECT ID = DepartmentID, c = COUNT(*)
    FROM dbo.Personnel 
    GROUP BY DepartmentID
  ) AS s
  ON d.ID = s.ID;

So this query doesn't do the GROUP BY on the Name column, but in my testing this is actually less efficient than your version unless you create a non-clustered index on Personnel.DepartmentID (in which case the plans are exactly the same). I suggest optimizing performance when you actually observe a performance problem, not based on hunches. In your case the grouping essentially comes along for free because it's part of the same clustered index scan that has to read all the data anyway. I'm not sure how forcing SQL Server to pick a random name (which is going to be in the same row as the ID you already read anyway) can possibly make anything faster...

Code Snippets

SELECT d.Id, d.Name, s.c
  FROM dbo.Departments AS d
  INNER JOIN 
  (
    SELECT ID = DepartmentID, c = COUNT(*)
    FROM dbo.Personnel 
    GROUP BY DepartmentID
  ) AS s
  ON d.ID = s.ID;

Context

StackExchange Database Administrators Q#22120, answer score: 8

Revisions (0)

No revisions yet.