Speeding up for-loop over a list
Problem
I have two lists with ca. 4000 elements each, where each element has two columns. These are being fed into a function. Apart from changing the function itself, is it possible to speed this up somehow? At this rate, it will take ca. 6-7 days to complete.
results=rep(0, length(as.numeric(unlist(coors))))
for(i in names(coors)){
  print(i)
  for( j in names(pols)){
    results[i]=point.in.polygon(coors[[i]][,1], coors[[i]][,2], pols[[j]][,1], pols[[j]][,2])
  }
}
EDIT: further description: I have the polygons for each individual in year 1 and the coordinates for year 2. I want to see how many of the locations from year 2 fall within the polygon from year 1 for each individual.
EDIT 2: the structure of coors:
str(coors)
List of 4052
$ 2225 :'data.frame': 48 obs. of 2 variables:
..$ cor.x: num [1:48] 635184 635215 635394 635431 635430 ...
..$ cor.y: num [1:48] 7151002 7151201 7151175 7151110 7151118 ...
$ 2226 :'data.frame': 56 obs. of 2 variables:
..$ cor.x: num [1:56] 635945 635936 635944 635969 635947 ...
..$ cor.y: num [1:56] 7152813 7152847 7152834 7152785 7152810 ...
$ 2227 :'data.frame': 56 obs. of 2 variables:
..$ cor.x: num [1:56] 636244 636245 636317 636450 636386 ...
..$ cor.y: num [1:56] 7151503 7151505 7151628 7151693 7151799 ...
$ 2228 :'data.frame': 56 obs. of 2 variables:
..$ cor.x: num [1:56] 636451 636418 636408 636467 636495 ...
..$ cor.y: num [1:56] 7152610 7152605 7152634 7152572 7152537 ...
There are always either 48 or 56 obs per element level.
pols has a similar structure, but its lengths vary more, as the number of observations depends on the shape of the polygon.
Solution
I assume you are using point.in.polygon from the sp package. The function can take any number of points, so you should rewrite your algorithm as a single loop: for each polygon in pols, make a single call to point.in.polygon that tests all the points in coors at once. This should save you a lot of time. Then you'll only have to do a little work reformatting the output. I can help with the code if you can please clarify what coors looks like.
Edit: Here is now an example of how you could code this into a single loop. You may have to adapt it a bit, as you have not really shown us how you want to store your output.
First, some sample data:
coors <- replicate(5, {
  n <- sample(5:10, 1)
  data.frame(x = runif(n), y = runif(n))
}, simplify = FALSE)
pols <- replicate(3, {
  n <- sample(5:10, 1)
  data.frame(x = runif(n), y = runif(n))
}, simplify = FALSE)
Here, we concatenate all the points in coors together, but keep a vector of group indices which we will use later for splitting by group:
all.coors <- do.call(rbind, coors)
num.points <- sapply(coors, nrow)
group.idx <- rep(seq_along(coors), num.points)
Now the single loop:
library(sp)  # provides point.in.polygon()
results <- vector("list", length(pols))
for (j in seq_along(pols)) {
  print(j)
  # One call per polygon tests every point; split() regroups the codes by element of coors.
  results[[j]] <- split(point.in.polygon(all.coors[,1], all.coors[,2],
                                         pols[[j]][,1], pols[[j]][,2]),
                        group.idx)
}
I am confident this will significantly speed up your computations. However, if this is still too slow, I agree you'll have to consider parallelization. Good luck.
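The question asks for how many year-2 locations fall inside each polygon, so the codes stored by the loop still need to be tallied. The following is only a sketch under stated assumptions, not part of the original answer: it uses a small hand-written results list (made-up values) with the same shape the loop produces, and it treats any code greater than 0 (inside, on an edge, or on a vertex, following the return values documented for point.in.polygon) as "within the polygon". The helper name count.inside is hypothetical.

```r
# Toy stand-in for `results` from the loop: one entry per polygon, each holding
# one integer vector of point.in.polygon codes per element of coors.
# The values are made up for illustration only.
results <- list(
  list(`1` = c(1L, 0L, 1L), `2` = c(0L, 0L)),   # polygon 1
  list(`1` = c(0L, 0L, 2L), `2` = c(1L, 3L))    # polygon 2
)

# Codes: 0 = outside, 1 = inside, 2 = on an edge, 3 = on a vertex.
# Hypothetical helper: count every point that is not strictly outside.
count.inside <- function(codes) sum(codes > 0L)

# Rows correspond to elements of coors, columns to polygons.
counts <- sapply(results, function(res) sapply(res, count.inside))
print(counts)
```

If coors and pols happen to be ordered so that element i of each belongs to the same individual (which the question does not confirm), diag(counts) would give the per-individual counts. Use codes == 1L in the helper instead if points on the boundary should not count.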
Code Snippets
coors <- replicate(5, {
  n <- sample(5:10, 1)
  data.frame(x = runif(n), y = runif(n))
}, simplify = FALSE)
pols <- replicate(3, {
  n <- sample(5:10, 1)
  data.frame(x = runif(n), y = runif(n))
}, simplify = FALSE)
all.coors <- do.call(rbind, coors)
num.points <- sapply(coors, nrow)
group.idx <- rep(seq_along(coors), num.points)
library(sp)  # provides point.in.polygon()
results <- vector("list", length(pols))
for (j in seq_along(pols)) {
  print(j)
  results[[j]] <- split(point.in.polygon(all.coors[,1], all.coors[,2],
                                         pols[[j]][,1], pols[[j]][,2]),
                        group.idx)
}
Context
StackExchange Code Review Q#16118, answer score: 3