Returning an existing data frame with four new columns
Problem
I'm trying to implement a function that, given a data frame, returns the same data frame with four columns added. For each row, I take the maximum element and its index and add them as two new columns, then do the same for the second-largest element. I don't care if the two values are repeated.
I'm sure this code can be improved. What do you recommend in order to do that? Is it fast enough?
add_2max <- function(x)
{
  max1 = max(x, na.rm=TRUE)
  indmax1 = which.max(x)
  y = x[-c(indmax1)]
  max2 = max(y, na.rm=TRUE)
  indmax2 = which(x==max2)
  indmax2 = ifelse(max1==max2, indmax2[2], indmax2[1])
  x = c(x, max1, max2, indmax1, indmax2)
  return(x)
}
add_2max_df <- function(DF)
{
  NewDF = t(apply(DF, 1, add_2max))
  return(NewDF)
}
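To make the intended output concrete, here is a minimal sketch that runs the two functions above on a small made-up data frame (the values and the column names `a`, `b`, `c` are assumptions chosen only for illustration). Each row gains the maximum, the second maximum, and their two column indices, in that order.

```r
# Functions as posted above, repeated so the example runs on its own
add_2max <- function(x)
{
  max1 = max(x, na.rm=TRUE)
  indmax1 = which.max(x)
  y = x[-c(indmax1)]
  max2 = max(y, na.rm=TRUE)
  indmax2 = which(x==max2)
  indmax2 = ifelse(max1==max2, indmax2[2], indmax2[1])
  x = c(x, max1, max2, indmax1, indmax2)
  return(x)
}
add_2max_df <- function(DF)
{
  NewDF = t(apply(DF, 1, add_2max))
  return(NewDF)
}

# Hypothetical input, made up for this illustration
DF <- data.frame(a = c(3, 7), b = c(9, 7), c = c(5, 1))
res <- unname(add_2max_df(DF))   # drop dimnames to keep the printout simple
res
# Row 1: 3 9 5 | 9 5 | 2 3   (max 9 at column 2, second max 5 at column 3)
# Row 2: 7 7 1 | 7 7 | 1 2   (tie: both maxima are 7, at columns 1 and 2)
is.matrix(res)   # TRUE: apply() returns a matrix, not a data frame
```

Because `apply` returns a matrix, the result would need a conversion such as `as.data.frame` if a data frame is really required.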
Solution
Here's a faster way:
add_2maxFaster <- function(x)
{
  imax1 <- which.max(x)
  imax2 <- which.max(x[-imax1])
  if (imax2 >= imax1) imax2 <- imax2 + 1L
  c(x, x[imax1], x[imax2], imax1, imax2)
}
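The only subtle step is the `if (imax2 >= imax1)` line: `which.max(x[-imax1])` returns a position in the shortened vector, so any position at or after the removed element has to be shifted by one to index into `x` again. A minimal check on three made-up vectors (the values are assumptions, chosen only to cover the shifted, unshifted, and tied cases):

```r
add_2maxFaster <- function(x)
{
  imax1 <- which.max(x)                     # position of the (first) maximum
  imax2 <- which.max(x[-imax1])             # runner-up position in the shortened vector
  if (imax2 >= imax1) imax2 <- imax2 + 1L   # map it back to a position in x
  c(x, x[imax1], x[imax2], imax1, imax2)
}

# Illustrative inputs, not taken from the original post
x1 <- c(3, 9, 5, 8)   # runner-up (8) sits after the maximum, so its index is shifted
x2 <- c(8, 3, 9, 5)   # runner-up (8) sits before the maximum, no shift needed
x3 <- c(7, 7, 1)      # tie: the second 7 is reported as the second maximum

r1 <- add_2maxFaster(x1)   # 3 9 5 8 9 8 2 4
r2 <- add_2maxFaster(x2)   # 8 3 9 5 9 8 3 1
r3 <- add_2maxFaster(x3)   # 7 7 1 7 7 1 2
```

This sketch assumes every row has at least two non-missing values; with fewer, `which.max` returns a zero-length result and the `if` condition fails.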
set.seed(42)
m <- matrix(runif(1e6), 1e4)
# Compare speed:
system.time( a1<-apply(m, 1, add_2max) ) # 0.38 secs
system.time( a2<-apply(m, 1, add_2maxFaster) ) # 0.15 secs
# ...And compare results
all.equal(a1,a2) # TRUE
Code Snippets
add_2maxFaster <- function(x)
{
  imax1 <- which.max(x)
  imax2 <- which.max(x[-imax1])
  if (imax2 >= imax1) imax2 <- imax2 + 1L
  c(x, x[imax1], x[imax2], imax1, imax2)
}
set.seed(42)
m <- matrix(runif(1e6), 1e4)
# Compare speed:
system.time( a1<-apply(m, 1, add_2max) ) # 0.38 secs
system.time( a2<-apply(m, 1, add_2maxFaster) ) # 0.15 secs
# ...And compare results
all.equal(a1,a2) # TRUE
Context
StackExchange Code Review Q#10792, answer score: 5