HiveBrain v1.2.0
Get Started
← Back to all entries
patternpythonMinor

Vector comparison inside for-loop

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
loopcomparisonforvectorinside

Problem

The following code finds no-rain periods in a time series that were preceded by rain over a threshold (e.g. 10mm in 2 days [k=48]).

significant_drydowns  0) & (rain == 0) 
    groups  0] <- NaN
    groups[groups == 0] <- NaN

    # remove drydowns where previous rain is below threshold
    past_rain <- rollsum(rain, k, fill=0, align='right')
    for (t in which(starts)) {
        if (past_rain[t-1] < threshold) {
            groups[groups == groups[t]] <- NaN
        }
    }

    return(groups)
}


The for loop is really slow, partly because of the == comparison. Is there any way to speed this code up?

Example data:

rain <-c(0,0,0,2,3,0,0,0,1,5,0,0,0,0,0,0,6,1,1,1,0,0,0)
names(rain) <- rain


Example output:

R> significant_drydowns(rain, 5, 5)
  0   0   0   2   3   0   0   0   1   5   0   0   0   0   0   0   6   1   1   1   0   0   0 
NaN NaN NaN NaN NaN   1   1   1 NaN NaN   2   2   2   2   2   2 NaN NaN NaN NaN   3   3   3 
R> significant_drydowns(rain, 7, 4)
  0   0   0   2   3   0   0   0   1   5   0   0   0   0   0   0   6   1   1   1   0   0   0 
 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   3   3   3


So, the group names don't matter, as long as they are unique. Groups are only assigned to dry-downs for which the previous k steps has a sum greater than threshold.

Solution

The following function works without for loops or any function of the *apply family. Furthermore, it does not require additional packages but makes use of base functions only. See the code comments for further details.

significant_drydowns  0 & !rain_tail
  # no-rain groups
  groups = threshold
  # replace `groups` values with NA if 
  #  (a) there is rain or 
  #  (b) this is the first rain or no-rain period
  is.na(groups) <- rain_tail | !groups
  # replace `groups` values with NA if amount of rain is below threshold
  # (here `groups` is used as a numeric index for `valid`)
  is.na(groups) <- !valid[groups]
  # add NA to match length of original vector and set names
  setNames(c(NA_integer_, groups), rain)
}


Some results:

> significant_drydowns(rain, 5, 5)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0 
NA NA NA NA NA  1  1  1 NA NA  2  2  2  2  2  2 NA NA NA NA  3  3  3 
> significant_drydowns(rain, 7, 4)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  3  3  3

Code Snippets

significant_drydowns <- function(rain, threshold = 10, k = 48) {
  # all values except the first one
  rain_tail <- rain[-1L]
  # logical index of start of no-rain period
  starts <- head(rain, -1L) > 0 & !rain_tail
  # no-rain groups
  groups <- cumsum(starts)
  # sum of the amount of rain in the last k elements
  # (for e.g., k = 5 the filter is c(1,1,1,1,1), therefore
  # the sum of 5 preceding elements is calculated)
  past_rain <- filter(rain, rep.int(1L, k), sides = 1L)
  # valid groups (previous amount of rain exceeds threshold)
  valid <- past_rain[starts] >= threshold
  # replace `groups` values with NA if 
  #  (a) there is rain or 
  #  (b) this is the first rain or no-rain period
  is.na(groups) <- rain_tail | !groups
  # replace `groups` values with NA if amount of rain is below threshold
  # (here `groups` is used as a numeric index for `valid`)
  is.na(groups) <- !valid[groups]
  # add NA to match length of original vector and set names
  setNames(c(NA_integer_, groups), rain)
}
> significant_drydowns(rain, 5, 5)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0 
NA NA NA NA NA  1  1  1 NA NA  2  2  2  2  2  2 NA NA NA NA  3  3  3 
> significant_drydowns(rain, 7, 4)
 0  0  0  2  3  0  0  0  1  5  0  0  0  0  0  0  6  1  1  1  0  0  0 
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA  3  3  3

Context

StackExchange Code Review Q#64608, answer score: 6

Revisions (0)

No revisions yet.