HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Counting repeated occurences for a group of fields

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
countingrepeatedgroupfieldsoccurencesfor

Problem

I am trying to count the continuous repeated value for a field in order to obtain the maximum repeated occurrences for each person in the table. eg:

Time         person        Result
08:12:36     Dave          Heads
08:12:37     Dave          Heads
08:12:39     Sue           Tails
08:12 41     Dave          Tails
08:12:42     Dave          Heads
08:12:44     Dave          Heads
08:12:45     Sue           Heads
08:12:46     Sue           Heads
08:12:47     Sue           Tails
08:12:48     Dave          Heads
08:12:49     Dave          Heads
08:12:50     Dave          Tails
08:12:51     John          Heads


Result should look like this

Dave     4
Sue      2
John     1


The count for Dave would be 4 as between 08:12:42 and 08:12:49, 4 Heads options occur for Dave before Dave gets a Tails option.

Solution

SELECT person, max(ct) AS max_ct
FROM  (
   SELECT person, count(*) AS ct
   FROM  (
      SELECT person, result
           , row_number() OVER (PARTITION BY person ORDER BY time)
           - row_number() OVER (PARTITION BY person, result ORDER BY time) AS grp
      FROM   tablex
      ) sub1
   GROUP BY person, result, grp
   ) sub2
GROUP  BY person
ORDER  BY max(ct) DESC, person;


SQL Fiddle.

Result exactly as desired.

After forming groups (grp) of consecutive pairs (person, result), the count has to be per pair and group (person, result, grp), not just per person and group (person, grp). There can (and will) be one group with the same group number per person and result, those have to be counted separately.

Also, it's not efficient to use CTEs here. In Postgres, performance is generally superior for subqueries, since CTEs are optimization fences that always materialize the result. Use CTEs only where they are actually necessary.

Code Snippets

SELECT person, max(ct) AS max_ct
FROM  (
   SELECT person, count(*) AS ct
   FROM  (
      SELECT person, result
           , row_number() OVER (PARTITION BY person ORDER BY time)
           - row_number() OVER (PARTITION BY person, result ORDER BY time) AS grp
      FROM   tablex
      ) sub1
   GROUP BY person, result, grp
   ) sub2
GROUP  BY person
ORDER  BY max(ct) DESC, person;

Context

StackExchange Database Administrators Q#84779, answer score: 2

Revisions (0)

No revisions yet.