HiveBrain v1.2.0
Get Started
← Back to all entries
patternphpMinor

Return most common items in list

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
returnitemslistcommonmost

Problem

I was wondering about the best way to return the most common items in a list. So far the best way I could do it was:

//@param string $this->matched_ids_list = 1,11,12,11,12,
$this->matched_ids_list = "," . $this->matched_ids_list;
$this->ids_count = explode(",", $this->matched_ids_list);
$this->ids_count = array_unique($this->ids_count); //See below why this is done
foreach ($this->ids_count as $this->id) {
    $this->appearences[$this->id] = substr_count($this->matched_ids_list, ",{$this->id},");
    //[1] => 1, [11] => 2, [12] => 2
}
$this->most_appearences = max($this->appearences); //2
foreach ($this->appearences as $this->id => $this->times) {
    if ($this->times == $this->most_appearences) {
        $this->top_ids .= $this->id . ",";
    }
}
$this->top_ids = rtrim($this->top_ids, ",");
echo "tops = " . $this->top_ids;


It basically just works for 1,11,12,11,12, and returns 11,12, but it seems a bit overcomplicated. I've been at it for some time now. Am I doing more steps than necessary, missing a built-in function or something?

I might be dealing with hundreds of ids, so performance is important.

Would this step:

foreach ($this->ids_count as $this->id) {
    $this->appearences[$this->id] = substr_count($this->matched_ids_list, ",{$this->id},");
    //[1] => 1, [11] => 2, [12] => 2
}


do more looping than necessary if array_unique wasn't used?

For instance, if ids_count was

[0] => 1, [1] => 11, [2] => 12, [3] => 11, [4] => 12


it would count how many times 11 and 12 appear on matched_ids_list twice. Instead, I use array_unique so ids_count is

[0] =>, [1] => 1, [2] => 11, [3] => 12

Solution

Skip to the Dramatic Update for the best solution

I was curious how to do this myself, because I could clearly tell there had to be an easier way! I started from scratch and I think I've found exactly what this question is asking for.

I don't normally just rewrite code, but since the question asks the best way, I believe I'm obligated to share what I believe is the "bets way".

$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

arsort($recorded);

$output = [];

foreach ($recorded as $key => $value) {
    if ($value == max($recorded)) {
        array_push($output, $key);
    }
}

print_r($output);


I'll explain what's happening here:

  • We receive the input in $input



  • array_count_values() returns an array using the values of array as keys and their frequency in array as values. In our case, from $input.



  • arsort — Sort an array in reverse order and maintain index association



  • Loop over each pair in the newly sorted array. Create some variables to reference the current pair.



  • If the current value is equal to the largest value from $recorded



  • Add that key to the output!



And there we have it!

Update to increase performance

$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

arsort($recorded);

$output = [];

$maximum = max($recorded);

foreach ($recorded as $key => $value) {
    if ($value == $maximum) {
        $output[] = $key;
    } else if ($value < $maximum) {
        break;
    }
}

print_r($output);


We've stored the maximum so we aren't checking every iteration, and we're checking for a value less than the maximum so we can break out of the loop to prevent going through the whole array.
Dramatic Update

So, I completely missed the array_keys function, as pointed out by shudder. The newly improved code would look something like:

$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

$output = array_keys($recorded, max($recorded));

print_r($output);


Crazy improvement compared to the OPs original code!

Code Snippets

$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

arsort($recorded);

$output = [];

foreach ($recorded as $key => $value) {
    if ($value == max($recorded)) {
        array_push($output, $key);
    }
}

print_r($output);
$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

arsort($recorded);

$output = [];

$maximum = max($recorded);

foreach ($recorded as $key => $value) {
    if ($value == $maximum) {
        $output[] = $key;
    } else if ($value < $maximum) {
        break;
    }
}

print_r($output);
$input = [1, 10, 11, 12, 11, 12, 18, 18];

$recorded = array_count_values($input);

$output = array_keys($recorded, max($recorded));

print_r($output);

Context

StackExchange Code Review Q#24446, answer score: 2

Revisions (0)

No revisions yet.