HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Codefights subsequence

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
codefightssubsequencestackoverflow

Problem

I am solving https://codefights.com/challenge/Qjts7cukDvYpDW4Bc/main

My approach is simple.
Get all subsequences by combination.
Simple loop over combinations.
Find minimum.

closestSequence2 <- function(a, b) {
LA = length(a);
LB = length(b);
minimum = Inf;

indexes = utils::combn(1:LB, LA);

for(i in 1:ncol(indexes)) 
{ 
 suma = (sum(abs(b[indexes[,i]]-a)));
 if(suma < minimum )
 {
   minimum = suma; 
 }
}  
return(minimum);   
}


I am not sure but I think it is failing due to the time because it pass the sample tests. Are there some easy tricks how to speed this up ot do I need different approach?

Solution

You can speed up your code by replacing the for loop with more efficient (vectorized) functions. Your code can then be reduced to a simple one-liner:

closestSequence2 <- function(a, b) min(colSums(abs(combn(b, length(a)) - a)))


Looking at it step-by-step:

  • combn is applied directly to b rather than 1:b. The output is a matrix where each column is a length(a)-long sub-sequence of b.



  • combn() - a takes advantage of R's recycling rules to compute, for each item in each sub-sequence, the signed distance to the corresponding element of a.



  • abs() converts to absolute (unsigned) distances.



  • colSums summarizes each column into a single value: the total distance from a for that sub-sequence.



  • min picks the minimum total distance across all candidates.



However, as pointed, an exhaustive search will not work well for large input vectors. A much faster approach is indeed to use dynamic programming. Here is a tentative implementation:

closestSequence2  La) for (j in 1:(Lb - La)) {
        x[1 + i, 1 + j] <- min(x[1 + i, j],
                               x[i, 1 + j] + d[i, i + j])
     }
  }
  min(x[La + 1, ]) # min value on the last row
}

Code Snippets

closestSequence2 <- function(a, b) min(colSums(abs(combn(b, length(a)) - a)))
closestSequence2 <- function(a, b) {
  La <- length(a)
  Lb <- length(b)
  d <- abs(outer(a, b, FUN = `-`)) # matrix of all distances

  # x[1 + i, 1 + j] will hold the optimal distance between
  # the i first elements of a and the best sub-sequence
  # within the i+j first elements of b.
  x <- matrix(NA, La + 1, Lb - La + 1)
  x[ 1,  ] <- 0                             # init case where i = 0
  x[-1, 1] <- cumsum(d[cbind(1:La, 1:La)])  # init case where j = 0
  for (i in 1:La) {
     if (Lb > La) for (j in 1:(Lb - La)) {
        x[1 + i, 1 + j] <- min(x[1 + i, j],
                               x[i, 1 + j] + d[i, i + j])
     }
  }
  min(x[La + 1, ]) # min value on the last row
}

Context

StackExchange Code Review Q#143128, answer score: 3

Revisions (0)

No revisions yet.