HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Calculating Lost Reputation

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
calculatinglostreputation

Problem

A discussion arose not long ago on the 2nd Monitor about how much reputation has been lost due to the reputation caps. There are a number of queries on the SEDE which try to address this:

  • A Users total Rep ON META if there was no reputation cap



  • Percent lost to reputation cap among top 1000 users



  • Top-rated users if there was no reputation cap



All of these queries (there are others as well) appear to suffer from the same two flaws... ( ... until now with the query in this question .... )

  • the user's actual reputation is affected by things that are complicated, like posts being deleted, migrated on to, and off from the site, the starting rep bonus, and other things.



  • that you cannot calculate a person's lost reputation without knowing the order in which the votes happened.



To understand why the post migration and deletion is important, you have to understand that reputation caps happens on a daily cycle. The number of days that people max their rep is much lower (for non-Skeets) than the number of times things are deleted, etc. There are things that happen on days other than rep-max days, that affect your reputation, and mean that voting records for those things are not available on the Data Explorer. Your user record records your total reputation, but there is no way to sum up all the reputation events and re-create that. Thus, any attempts to figure out how much rep is lost will come up short, or long, and this leads to funny things like negative-lost-reputation.

To understand why the order is significant, consider the following poor user 'Bob':

  • Lucky day, and ends up at exactly 200 rep from a good answer.



  • Someone upvotes again, and he loses 10 rep to the cap.



  • Someone downvotes and he loses 2 (198)



  • Someone downvotes again, he loses another 2



  • Now his rep is 196 and the day ends ... sorry, no mortarboard for you, and you lost 10 rep



If the order was different, then it could be lucky 'Bill' instead:

  • Lucky day, and ends up at exactly 2

Solution

Going through this code again, it appears there are a few things that could be improved.

Use #Temp Table for User Selection

Instead of doing a cheap-shot UNION select to get the set of users to process, the right way would be to create a temp table, and then conditionally populate it:

create table #TopUsers (
    UserId int not null,
    Reputation int not null)

if @userid < 0
begin

    insert into #TopUsers
    select Users.Id as UserId, Users.Reputation as Reputation
    from Users, Badges
    where badges.Name = 'Mortarboard'
      and Badges.UserId = Users.id

end else begin

    insert into #TopUsers
    select Id, Reputation
    from Users
    where Id = @userid

end

declare TOPUSERS cursor for
  select *
  from #TopUsers


This approach is more verbose, but the intent of the code is clearer and it makes it more maintainable.

Explicit vs. Implicit JOIN

Explicit joins have been available in SQL since the SQL-92 standard. The code in this question uses the coding standards from the older SQL-89 standard.

There is nothing technically wrong with the implicit join syntax used in the code, but it is old-fashioned, and using explicit join syntax allows the join conditions in larger queries to be more obvious... changes would be, for example:

select Users.Id as UserId, Users.Reputation as Reputation
from Users, Badges
where @userid < 0
  and badges.Name = 'Mortarboard'
  and Badges.UserId = Users.id


would become:

select Users.Id as UserId, Users.Reputation as Reputation
from Users
inner join Badges
        on Badges.UserId = Users.id
where @userid < 0
  and badges.Name = 'Mortarboard'

Code Snippets

create table #TopUsers (
    UserId int not null,
    Reputation int not null)

if @userid < 0
begin

    insert into #TopUsers
    select Users.Id as UserId, Users.Reputation as Reputation
    from Users, Badges
    where badges.Name = 'Mortarboard'
      and Badges.UserId = Users.id

end else begin

    insert into #TopUsers
    select Id, Reputation
    from Users
    where Id = @userid

end

declare TOPUSERS cursor for
  select *
  from #TopUsers
select Users.Id as UserId, Users.Reputation as Reputation
from Users, Badges
where @userid < 0
  and badges.Name = 'Mortarboard'
  and Badges.UserId = Users.id
select Users.Id as UserId, Users.Reputation as Reputation
from Users
inner join Badges
        on Badges.UserId = Users.id
where @userid < 0
  and badges.Name = 'Mortarboard'

Context

StackExchange Code Review Q#40595, answer score: 10

Revisions (0)

No revisions yet.