Git Insert Delete Graph
Problem
I attempted to recreate the GitHub code frequency graph (example) with a daily granularity using Perl and git log. How did I do, and what improvements can I make? I know that I should try to reduce the frequency of keys on the x-axis, but I have no idea how to do that and maintain scale.
It should be noted that the default application for opening SVG files should be set to a browser of some sort.
#!/usr/bin/env perl
# Henry J Schmale
# November 4, 2015
#
# Creates an insertion and deletion per day graph for a git repo. It
# outputs an svg of the graph on standard output.
#
# This script can take the name of a directory to produce the graph for
# that directory. If no param is given, then it does it in the current
# directory.
#
# Requires SVG::TT::Graph::Line
use strict;
use warnings;
# Get the path to the stylesheet first
use File::Spec;
use File::Basename;
my $graphsty = dirname(File::Spec->rel2abs(__FILE__)) . '/svg-graph-ss.css';
# CD into the directory specified if specified
if(-e $ARGV[0] and -d $ARGV[0]){
    chdir $ARGV[0];
}
# Indexed by date
my %commits;
# get the git log and preprocess it
my $gitlogOutput = qx(git log --numstat --pretty="%H %aI" | grep -v '^\$');
my @lines = split /\n/, $gitlogOutput;
my $date;
my $hash;
foreach (@lines) {
    chomp;
    my @fields = split /\s+/;
    # Length of sha1sum
    if(length($fields[0]) > 39){
        $hash = $fields[0];
        $date = substr($fields[1], 0, 10);
    }else{
        $commits{$date}->{ins} += $fields[0];
        $commits{$date}->{del} += $fields[1];
    }
}
use DateTime;
use Date::Parse;
use Data::Dumper;
my $firstDate = getDateTime(((sort keys %commits)[0]));
my $lastDate = getDateTime(((sort keys %commits)[-1]));
# print "$firstDate\t".((sort keys %commits)[0])."\n";
# print "$lastDate\t".((sort keys %commits)[-1])."\n";
# print (scalar keys %commits)."\n";
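# Fill in days without commits so the x-axis has no gaps
while($firstDate->add(days => 1) < $lastDate){
    my $key = $firstDate->ymd('-');
    if(!defined $commits{$key}){
        $commits{$key}->{ins} = 0;
        $commits{$key}->{del} = 0;
    }
}
# ...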
Solution
A simple thing to add would be a logarithmic Y-axis scaling function, so that large bursts of either losses or gains don't get exaggerated perceptually. Essentially, you want a 100-commit day to take up at most 10x as much space as a 1-commit day, so that "low activity" wavering is still visible.
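As a rough sketch of the log-scaling idea, assuming the per-day totals are plain numbers: the helper below (`log_scale` is a made-up name, not part of the original script) maps a count onto a log10 axis. It uses log10(1 + n) rather than log10(n) so that a zero-activity day stays at 0, which is one possible convention among several.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): compress a daily
# count onto a log10 scale. Adding 1 keeps a count of 0 at 0, and the sign
# is preserved so deletions could still be drawn below the axis.
sub log_scale {
    my ($n) = @_;
    my $scaled = log(1 + abs($n)) / log(10);
    return $n < 0 ? -$scaled : $scaled;
}

# Made-up sample values to show how large bursts get compressed.
printf "%5d -> %.2f\n", $_, log_scale($_) for (0, 1, 9, 99, 999, -99);
```

The scaled values are what would be plotted; the axis labels would then need to show the original counts rather than the scaled ones.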
I'd also suggest collapsing each window of about 7 days into a single data point, by aggregating the days and representing each window as 5 lines:
- Max number of commits per day in the 7-day range
- Upper quartile of commits per day in the 7-day range
- Median number of commits per day in the 7-day range
- Lower quartile of commits per day in the 7-day range
- Min number of commits per day in the 7-day range
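The five values listed above could be computed along these lines. This is only a sketch: `quantile` and `weekly_summary` are hypothetical names, the input is assumed to be a chronological list of per-day counts with missing days already filled in as 0, and the quartiles use linear interpolation, which is one common convention among several.

```perl
use strict;
use warnings;

# Linear-interpolation quantile over an already sorted list (other quartile
# definitions give slightly different values).
sub quantile {
    my ($p, @sorted) = @_;
    my $pos  = $p * $#sorted;
    my $lo   = int($pos);
    my $hi   = $lo < $#sorted ? $lo + 1 : $lo;
    my $frac = $pos - $lo;
    return $sorted[$lo] + $frac * ($sorted[$hi] - $sorted[$lo]);
}

# Hypothetical helper: split a chronological list of daily counts into
# windows of $size days and return [min, q1, median, q3, max] per window.
# A final window shorter than $size is summarised as-is.
sub weekly_summary {
    my ($size, @daily) = @_;
    my @summary;
    while (my @window = splice(@daily, 0, $size)) {
        my @s = sort { $a <=> $b } @window;
        push @summary, [ $s[0], quantile(0.25, @s), quantile(0.5, @s),
                         quantile(0.75, @s), $s[-1] ];
    }
    return @summary;
}

# Made-up daily commit counts for two weeks.
my @counts = (1, 0, 4, 2, 10, 3, 5,   0, 0, 1, 7, 2, 2, 1);
print join(' ', @$_), "\n" for weekly_summary(7, @counts);
```

Each returned row would become one vertical column in the graph, with the five values drawn as five separate lines.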
And then render it with some shading, like this (half-assed mockup).
With this, you could afford to narrow the sample window to, say, "per hour", but still represent the data accumulated in terms of weeks, so you'd have "max commits/hour this week", "min commits/hour this week", etc., all represented in a single vertical column.
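A minimal sketch of the per-hour variant, assuming the input is a list of author timestamps in the ISO 8601 form that git prints for `%aI`. The name `hourly_extremes_by_week` is made up for illustration, weeks are simple 7-day buckets counted from the Unix epoch (so they run Thursday to Wednesday), and the commit's own local date and hour are used without timezone conversion.

```perl
use strict;
use warnings;
use Time::Piece;
use List::Util qw(min max);

# Hypothetical helper: return, per week, [min, max] commits per hour.
# Hours without any commit are not counted, so "min" is over active hours.
sub hourly_extremes_by_week {
    my @stamps = @_;

    my %per_hour;    # "YYYY-MM-DDTHH" => commit count
    $per_hour{ substr($_, 0, 13) }++ for @stamps;

    my %by_week;     # week index => list of hourly counts
    for my $hour (keys %per_hour) {
        my $day  = Time::Piece->strptime(substr($hour, 0, 10), '%Y-%m-%d');
        my $week = int($day->epoch / (7 * 86_400));   # 7-day buckets since the epoch
        push @{ $by_week{$week} }, $per_hour{$hour};
    }

    return map  { [ min(@{ $by_week{$_} }), max(@{ $by_week{$_} }) ] }
           sort { $a <=> $b } keys %by_week;
}

# Made-up timestamps: three commits in one week, one commit a week later.
my @stamps = (
    '2015-11-04T13:45:10-05:00', '2015-11-04T13:50:00-05:00',
    '2015-11-04T09:00:00-05:00', '2015-11-12T10:00:00-05:00',
);
print join(' ', @$_), "\n" for hourly_extremes_by_week(@stamps);
```

To get a true minimum that includes idle hours, the empty hours would have to be filled in with 0 first, much like the original script fills in empty days.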
Context
StackExchange Code Review Q#110865, answer score: 4