patternpythonMinor
Vector comparison inside for-loop
Viewed 0 times
loopcomparisonforvectorinside
Problem
The following code finds no-rain periods in a time series that were preceded by rain over a threshold (e.g. 10mm in 2 days [k=48]).
The
Example data:
Example output:
So, the group names don't matter, as long as they are unique. Groups are only assigned to dry-downs for which the previous k steps has a sum greater than threshold.
significant_drydowns 0) & (rain == 0)
groups 0] <- NaN
groups[groups == 0] <- NaN
# remove drydowns where previous rain is below threshold
past_rain <- rollsum(rain, k, fill=0, align='right')
for (t in which(starts)) {
if (past_rain[t-1] < threshold) {
groups[groups == groups[t]] <- NaN
}
}
return(groups)
}The
for loop is really slow, partly because of the == comparison. Is there any way to speed this code up?Example data:
rain <-c(0,0,0,2,3,0,0,0,1,5,0,0,0,0,0,0,6,1,1,1,0,0,0)
names(rain) <- rainExample output:
R> significant_drydowns(rain, 5, 5)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NaN NaN NaN NaN NaN 1 1 1 NaN NaN 2 2 2 2 2 2 NaN NaN NaN NaN 3 3 3
R> significant_drydowns(rain, 7, 4)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 3 3 3So, the group names don't matter, as long as they are unique. Groups are only assigned to dry-downs for which the previous k steps has a sum greater than threshold.
Solution
The following function works without
Some results:
for loops or any function of the *apply family. Furthermore, it does not require additional packages but makes use of base functions only. See the code comments for further details.significant_drydowns 0 & !rain_tail
# no-rain groups
groups = threshold
# replace `groups` values with NA if
# (a) there is rain or
# (b) this is the first rain or no-rain period
is.na(groups) <- rain_tail | !groups
# replace `groups` values with NA if amount of rain is below threshold
# (here `groups` is used as a numeric index for `valid`)
is.na(groups) <- !valid[groups]
# add NA to match length of original vector and set names
setNames(c(NA_integer_, groups), rain)
}Some results:
> significant_drydowns(rain, 5, 5)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NA NA NA NA NA 1 1 1 NA NA 2 2 2 2 2 2 NA NA NA NA 3 3 3
> significant_drydowns(rain, 7, 4)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 3 3 3Code Snippets
significant_drydowns <- function(rain, threshold = 10, k = 48) {
# all values except the first one
rain_tail <- rain[-1L]
# logical index of start of no-rain period
starts <- head(rain, -1L) > 0 & !rain_tail
# no-rain groups
groups <- cumsum(starts)
# sum of the amount of rain in the last k elements
# (for e.g., k = 5 the filter is c(1,1,1,1,1), therefore
# the sum of 5 preceding elements is calculated)
past_rain <- filter(rain, rep.int(1L, k), sides = 1L)
# valid groups (previous amount of rain exceeds threshold)
valid <- past_rain[starts] >= threshold
# replace `groups` values with NA if
# (a) there is rain or
# (b) this is the first rain or no-rain period
is.na(groups) <- rain_tail | !groups
# replace `groups` values with NA if amount of rain is below threshold
# (here `groups` is used as a numeric index for `valid`)
is.na(groups) <- !valid[groups]
# add NA to match length of original vector and set names
setNames(c(NA_integer_, groups), rain)
}> significant_drydowns(rain, 5, 5)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NA NA NA NA NA 1 1 1 NA NA 2 2 2 2 2 2 NA NA NA NA 3 3 3
> significant_drydowns(rain, 7, 4)
0 0 0 2 3 0 0 0 1 5 0 0 0 0 0 0 6 1 1 1 0 0 0
NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 3 3 3Context
StackExchange Code Review Q#64608, answer score: 6
Revisions (0)
No revisions yet.