HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Not getting the concept of COUNT with GROUP BY?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
thegroupwithconceptgettingcountnot

Problem

This is not my job, neither my homework. I'm just self-learnin from Prof. Donatus course on udemy and these are the problems he used in his course.

Objective 1:
Your logistics manager wants to know the number of times each ship region appears in the company database. Use SQL query to get the details.

Objective 2:
Using the shipregion column, count how many times each value has appeared in the column.

The database: Northwind

The table: orders

Solution of 1:

SELECT ShipRegion,
    COUNT(ShipRegion)
FROM
    orders
GROUP BY ShipRegion;


Solution of 2:

SELECT ShipRegion,
    COUNT(*)
FROM
    orders
GROUP BY ShipRegion;


I understand the first solution but not the second solution.

My confusion: How does doing COUNT(*) makes sure to count all shipregions only? Is it because group by shipregion is used?

If that's the case, in above solution, we are also usingg COUNT(shipregion). I don't understand. Please help.

Solution

The COUNT() function is an aggregate function that returns the number of rows in a table. The COUNT() function allows you to count all rows or only rows that match a specified condition.

The COUNT() function has three forms:

  • COUNT(*),



  • COUNT(expression)



  • COUNT(DISTINCT expression)



COUNT(*) function

The COUNT() function returns the number of rows in a result set returned by a SELECT statement. The COUNT() returns the number of rows including duplicate, non-NULL and NULL rows.

COUNT(expression)

The COUNT(expression) returns the number of rows that do not contain NULL values as the result of the expression.

COUNT(DISTINCT expression)

The COUNT(DISTINCT expression) returns the number of distinct rows that do not contain NULL values as the result of the expression.

The return type of the COUNT() function is BIGINT. The COUNT() function returns 0 if there is no matching row found.

Consider a simplified example based on your data

CREATE TABLE   `orders` (
  `ShipRegion` varchar(6) 
) ;

INSERT INTO `orders` VALUES (null),(null),('RJ'),('RJ'),('RJ'),('SP'),('NM'),('NM');


Query1,

SELECT ShipRegion,
      COUNT(*)
FROM orders
GROUP BY ShipRegion;


Result

ShipRegion  COUNT(*)
   null       2
   RJ         3
   SP         1
   NM         2


Query2,

SELECT ShipRegion,
       COUNT( ShipRegion)
FROM  orders
GROUP BY ShipRegion;


Result

ShipRegion  COUNT( ShipRegion)
    null         0
    RJ           3
    SP           1
    NM           2


Query3,

SELECT ShipRegion,
       COUNT(distinct ShipRegion)
FROM  orders
GROUP BY ShipRegion;


Result

ShipRegion  COUNT(distinct ShipRegion)
    null        0
    NM          1
    RJ          1
    SP          1


My confusion: How does doing COUNT(*) makes sure to count all
shipregions only? Is it because group by shipregion is used?

Yes, The COUNT(*) function is often used with a GROUP BY clause to return the number of elements in each group.

In the first Objective I think you need to apply where shipregion is not null, because null is not equal to anything

SELECT ShipRegion,
      COUNT(*)
FROM orders
WHERE ShipRegion is not null
GROUP BY ShipRegion;


See demo here

Read more on Counting Rows

Code Snippets

CREATE TABLE   `orders` (
  `ShipRegion` varchar(6) 
) ;

INSERT INTO `orders` VALUES (null),(null),('RJ'),('RJ'),('RJ'),('SP'),('NM'),('NM');
SELECT ShipRegion,
      COUNT(*)
FROM orders
GROUP BY ShipRegion;
ShipRegion  COUNT(*)
   null       2
   RJ         3
   SP         1
   NM         2
SELECT ShipRegion,
       COUNT( ShipRegion)
FROM  orders
GROUP BY ShipRegion;
ShipRegion  COUNT( ShipRegion)
    null         0
    RJ           3
    SP           1
    NM           2

Context

StackExchange Database Administrators Q#327906, answer score: 5

Revisions (0)

No revisions yet.