HiveBrain v1.2.0
Get Started
← Back to all entries
patternphpMinor

Building a histogram array in PHP

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
arrayphphistogrambuilding

Problem

I'm visualizing my stored data in an histogram similar to this one:

However, I'm sure that my code smells a lot! So I'd like to have your opinion on how to make it better.

$listings = array(
              array('Price' => 10),
              array('Price' => 10),
              array('Price' => 11)
            ); // so on and so forth, this data is pulled from the database. 

// flattening the array so I remove the "Price" key
// the array takes the form of array(10,10,11,...)
foreach($listings as $listing) {
    $flattenedListings[] = $listing['Price'];
}

$widths = range(0, 100, 10); // creates array of the form: array(0, 10, 20, 30, 40, ...)
$bins = array();
$isLast = count($widths);
foreach($widths as $key => $value) {
    if($key  $value, 'max' => $widths[$key+1]);
    }
}

// creates array of the form:
// $bins = array(
//      array('min'=>0, 'max' => 10),
//      array('min'=>10,'max' => 20),
//      array('min'=>30, 'max'=>40)
//     );

$histogram = array();
        foreach($bins as $bin) {
            $histogram[$bin['min']."-".$bin['max']] = array_filter($flattenedListings, function($element) use ($bin) {
                if( ($element > $bin['min']) && ($element  array(1, 2, 3, 4, 5, 6), 
//         '10-20' => array(11, 19, 12),
//         '20-30' => array(),
//   );
// Or in other words: foreach range in the histogram, php creates an array containing      values within the allowed limits.

    foreach($histogram as $key => $val) {
        $flotHistogram[$key] = (is_array($val)) ? ( (count($val)) ? count($val) : 0 ) : 0; 
    }

// And finally it just counts them, and returns a new array.


As you can see in my code, I'm making an excessive use of foreach loops to basically do a "simple task."

Basically what I have is a list of items along with their prices, and I want to count how many items belong to each range. My ranges are 10 numbers apart: (0 - 10, 10 - 20, 20 - 30, etc...)

Ideally the output should tell me how many items there are in

Solution

Your code isn't exactly terrible, but this is code-review, and good code review has to be harsh, so please keep in mind that I mean this in the nicest of ways.

That said, there are some issues I'd like to address:

Redundancies - DNRY:

That old dead horse: Do Not Repeat Yourself. You have a lot of redundant code. Of course, you can get into code-golfing, and try to write everything as a one-liner. That will result in unmaintainable code, and that's not what we want, but in your case, there is room for some minor improvements that won't harm readability one bit, quite the contrary.

You say you're getting the data (prices, I take it) from a PDO resultset, why, then, don't you fill the $flattenedListings array like so:

$flattenedListings = array();
while($row = $pdoResOrStmt->fetch(PDO::FETCH_ASSOC))
{
    $flattenedListings[] = $row['Price'];
}


That's going to reduce the resources your script requires a tad, and fetches the data while at the same time constructing the flattened array.

Next, the $isLast = count($widths); variable can be ditched all together, either by a simple foreach, very close to to the loop you currently have:

$widths = range(0, 100, 10);
$bins = array();
foreach($widths as $key => $val)
{
    if (!isset($widths[$key +1])) break;//break on the last key
    $bins[] = array(
        'min' => $val,
        'max' => $widths[$key + 1]
    );
}


But, and this might be personal, I dislike this if() break; thing in a loop. Personally, I'd probably write something like:

for($i=0, $j=count($widths) -1; $i next to last key
    $bins[] = array(
        'min' => $widths[$i],
        'max' => $widths[$i+1],//can get last key
    );
}


Next, the $histogram. You're using a closure function + array_filter to group your values according to the price range. In itself, it's quite easy to read, and not that inefficient. But I find that closures in PHP (which are objects, instances of Closure) are 9/10 overkill. They're also not a "natural" feature of the language. PHP isn't a functional language, like Lisp, Scheme or JavaScript are. I've compared them to drag queens: A man can wear a dress, stockings, a wig and put on make-up as much as he wants, it still doesn't change the fact that he's a man.

Often, what you end up with is a grotesque, bemusing performance that leaves some feeling ill at ease.

There is a far easier way of determining the range for each price:

$histogram = array();
foreach($flattenedListings as $price)
{//assuming nothing costs 0
    $key = floor(($price -1)/10);
    $range = $bins[$key];
    $key = $range['min'] .'-'. $range['max'];
    if (!isset($histogram[$key])) $histogram[$key] = array();
    $histogram[$key][] = $price;
}


If some price can be zero, then just replace:

$key = floor(($price -1)/10);


with

$key = $price ? floor(($price -1)/10) : 0;


And you're good to go.

Simplify:

Now we've got that sorted out, it's time to simplify things a bit. As you can see by the way I've filled $histogram, I don't actually need the min and max values from the $bins array, so instead of filling that array with array('min'=> $x, 'max'=> $x+1), I could easily have written:

$widths = range(0, 100, 10);
$bins = array();
foreach($widths as $key => $val)
{
    if (!isset($widths[$key + 1])) break;
    $bins[] = $val.'-'. ($widths[$key + 1]);
}


This also makes filling the $flotHistogram array a lot easier:

$flotHistogram = array_fill_keys($bins, 0);//set all keys to 0


Then, when filling the $histogram:

$histogram = array();
foreach($flattenedListings as $price)
{
    $key = $bins[floor(($price -1)/10)];
    if (!isset($histogram[$key])) $histogram[$key] = array();
    $flotHistogram[$key]++;//add 1 count
    $histogram[$key][] = $price;
}


This leaves us with the following code:

//get prices
$flattenedListings = array();
while($row = $pdoResOrStmt->fetch(PDO::FETCH_ASSOC))
{
    $flattenedListings[] = $row['Price'];
}
//construct range-keys array
$widths = range(0, 100, 10);
$bins = array();
foreach($widths as $key => $val)
{
    if (!isset($widths[$key + 1])) break;
    $bins[] = $val.'-'. ($widths[$key + 1]);
}
//construct flotHistogram count array
$flotHistogram = array_fill_keys($bins, 0);
//construct array of prices for each key
$histogram = array();
foreach($flattenedListings as $price)
{
    $key = $bins[floor(($price -1)/10)];
    if (!isset($histogram[$key])) $histogram[$key] = array();
    $flotHistogram[$key]++;
    $histogram[$key][] = $price;
}


A bridge too far(?)

I may be taking things a bit too far, but that $flattenedListings array seems a bit redundant, now. However I don't know what else you need the db results for, but all in all, you may want to consider doing something like

```
$widths = range(0, 100, 10);
$bins = array();
foreach($widths as $key => $val)
{
if (!isset($widths[$key + 1])) break;
$bins[] = $val.'-'. ($widths[$key + 1]);
}

$flotHistogr

Code Snippets

$flattenedListings = array();
while($row = $pdoResOrStmt->fetch(PDO::FETCH_ASSOC))
{
    $flattenedListings[] = $row['Price'];
}
$widths = range(0, 100, 10);
$bins = array();
foreach($widths as $key => $val)
{
    if (!isset($widths[$key +1])) break;//break on the last key
    $bins[] = array(
        'min' => $val,
        'max' => $widths[$key + 1]
    );
}
for($i=0, $j=count($widths) -1; $i<$j;++$i)
{//count -1 = last key, but $i <$j => next to last key
    $bins[] = array(
        'min' => $widths[$i],
        'max' => $widths[$i+1],//can get last key
    );
}
$histogram = array();
foreach($flattenedListings as $price)
{//assuming nothing costs 0
    $key = floor(($price -1)/10);
    $range = $bins[$key];
    $key = $range['min'] .'-'. $range['max'];
    if (!isset($histogram[$key])) $histogram[$key] = array();
    $histogram[$key][] = $price;
}
$key = floor(($price -1)/10);

Context

StackExchange Code Review Q#36266, answer score: 6

Revisions (0)

No revisions yet.