Long format data - fill episode based on conditional previous episode

Submitted by: @import:stackexchange-codereview

Problem

The data are in long format. Four individuals are observed over 4 or 5 days (BCSID is the unique key identifying each individual). The data describe the activities performed during those days: START gives the start time of each activity and MAINACT gives the activity code.

The data:

```
data = structure(list(BCSID = c("B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10001N", "B10001N", "B10001N",
"B10001N", "B10001N", "B10001N", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R", "B10004R", "B10004R", "B10004R", "B10004R",
"B10004R", "B10004R",
```

Solution

I would first write (or find) a function that shifts a vector x by a given number of observations k. The stats package has a lag function, but it is meant for time series: it shifts the time base of a ts object rather than the values of a plain vector. Here is a function that shifts a plain vector both ways, with positive or negative k:

LAG <- function(x, k) {
   if (k == 0) {
      x
   } else if (k > 0) {
      c(rep(NA, k), head(x, -k))
   } else {
      c(tail(x, k), rep(NA, -k))
   }
}
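
To see what the shift does, here is a minimal sketch on a toy vector. The vector x and the names prev_value and next_value are made up for illustration; the LAG definition is repeated only so the example runs on its own.

```r
# LAG as defined in the answer, repeated so this example is self-contained.
LAG <- function(x, k) {
   if (k == 0) {
      x
   } else if (k > 0) {
      c(rep(NA, k), head(x, -k))
   } else {
      c(tail(x, k), rep(NA, -k))
   }
}

x <- c("a", "b", "c", "d")   # hypothetical toy vector

prev_value <- LAG(x, +1)   # value from the previous position: NA  "a" "b" "c"
next_value <- LAG(x, -1)   # value from the next position:     "b" "c" "d" NA
```

A positive k therefore looks backwards (the previous episode) and a negative k looks forwards (the next episode), with NA filling the positions that have no neighbour.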


Then, you can create a logical vector indicating whether each row meets all the conditions:

need_replace <- with(data, eorder2 == 1 &
                           DAY != 1 &
                           MAINACT == '-11' &
                           LAG(MAINACT, +1) %in% c('1301', '1302') &
                           LAG(MAINACT, -1) == '1302')


And finally, do the substitution:

data$MAINACT[need_replace] <- '1606'
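
Since the example data in the question is not reproduced in full here, the following sketch runs the same steps on a small, made-up data frame. The data frame toy and its five rows are hypothetical; only the column names (DAY, eorder2, MAINACT) and the conditions come from the answer.

```r
# LAG as defined in the answer, repeated so this example is self-contained.
LAG <- function(x, k) {
   if (k == 0) {
      x
   } else if (k > 0) {
      c(rep(NA, k), head(x, -k))
   } else {
      c(tail(x, k), rep(NA, -k))
   }
}

# Hypothetical toy data: row 3 is the first episode of day 2, coded '-11',
# preceded by '1302' and followed by '1302'.
toy <- data.frame(
   DAY     = c(1, 1, 2, 2, 2),
   eorder2 = c(1, 2, 1, 2, 3),
   MAINACT = c("1301", "1302", "-11", "1302", "1401"),
   stringsAsFactors = FALSE   # keep MAINACT as character, not factor
)

need_replace <- with(toy, eorder2 == 1 &
                          DAY != 1 &
                          MAINACT == '-11' &
                          LAG(MAINACT, +1) %in% c('1301', '1302') &
                          LAG(MAINACT, -1) == '1302')

toy$MAINACT[need_replace] <- '1606'   # only row 3 is recoded
```

Note that LAG shifts the whole column, so the neighbour of the first or last row of one BCSID is a row belonging to another individual. Whether that matters depends on the data; if it does, the shift would need to be done within each BCSID.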


A few more comments:

  • I created a vector of TRUE/FALSE values rather than a vector of indices as you did with which. Both work, but skipping which means less typing.

  • I used with(data, ...) so that I did not have to type data$ over and over. This also makes the code shorter and easier to read.

  • I used %in% instead of two == comparisons joined by |. It is another good function to know (imagine having many more than two allowed values).

  • Be careful: & has higher precedence than |, so what you had written was equivalent to statement1 | (statement2 & statement3), which is not what I think you had in mind: (statement1 | statement2) & statement3. Precedence rules are documented under ?Syntax.

  • As it stands, none of the rows in your example data match all the conditions you specified. Please let me know if I misunderstood something; I am sure it will be a simple fix.
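
The precedence and %in% points above can be checked with a short sketch. The values s1, s2, s3 and the vector act are made up for illustration.

```r
# Hypothetical logical values standing in for statement1..statement3.
s1 <- TRUE
s2 <- FALSE
s3 <- FALSE

unparenthesised <- s1 | s2 & s3     # parsed as s1 | (s2 & s3), gives TRUE
grouped         <- (s1 | s2) & s3   # explicit grouping, gives FALSE

# %in% versus two == comparisons joined by |
act <- c("1301", "-11", "1302")     # hypothetical activity codes
with_or <- act == "1301" | act == "1302"
with_in <- act %in% c("1301", "1302")

# A logical vector and which() select the same elements.
by_logical <- act[with_in]
by_index   <- act[which(with_in)]
```

The two forms of the condition give different answers as soon as statement1 is TRUE and statement3 is FALSE, which is why the parentheses (or %in%) matter.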

Code Snippets

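# Shift x by k positions: k > 0 gives the previous values, k < 0 the next ones.
LAG <- function(x, k) {
   if (k == 0) {
      x
   } else if (k > 0) {
      c(rep(NA, k), head(x, -k))
   } else {
      c(tail(x, k), rep(NA, -k))
   }
}

# TRUE for the rows that meet every condition.
need_replace <- with(data, eorder2 == 1 &
                           DAY != 1 &
                           MAINACT == '-11' &
                           LAG(MAINACT, +1) %in% c('1301', '1302') &
                           LAG(MAINACT, -1) == '1302')

# Recode the matching rows.
data$MAINACT[need_replace] <- '1606'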

Context

StackExchange Code Review Q#95594, answer score: 3
