patternpythonMinor
Skyline problem solved by sweep line
Viewed 0 times
skylineproblemlinesweepsolved
Problem
The Skyline problem from UVa Online Judge is as follows:
You are to design a program to assist an architect in drawing the skyline of a city given the locations of the buildings in the city. A building is specified by an ordered triple \$ (L_i,H_i,R_i) \$ where \$ L_i \$ and \$ R_i \$ are left and right coordinates, respectively, of building \$ i \$ and \$ H_i \$ is the height of the building.
The input is a sequence of building triples. All coordinates of buildings are integers less than 10,000 and there will be at least one and at most 5,000 buildings in the input file. Each building triple is on a line by itself in the input file. All integers in a triple are separated by one or more spaces. The triples will be sorted by \$ L_i \$, the left x-coordinate of the building, so the building with the smallest left x-coordinate is first in the input file.
The output should consist of the skyline vector \$ (v_1, v_2, v_3, \ldots, v_{n−2}, v_{n−1}, v_n)\$, the \$ v_i \$ such that \$ i \$ is an even number represent a horizontal line (height). The \$ v_i \$ such that \$ i \$ is an odd number represent a vertical line (x-coordinate). The skyline vector should represent the “path” taken, for example, by a bug starting at the minimum x-coordinate and traveling horizontally and vertically over all the lines that define the skyline. Thus the last entry in all skyline vectors will be a 0.
This is my solution:
```
# SortedDict
from sortedcontainers import SortedDict # pip3 install sortedcontainers
from collections import deque
import heapq
def skyline(buildings_by_L): # a deque.
buildings_by_H = SortedDict() # a self-balancing binary tree.
buildings_by_R = [] # a priority queue.
skyline = []
def add_point(x):
h = buildings_by_H.iloc[-1] if buildings_by_H else 0
if not skyline:
skyline.append((x, h))
else:
x0, h0 = skyline[-1]
You are to design a program to assist an architect in drawing the skyline of a city given the locations of the buildings in the city. A building is specified by an ordered triple \$ (L_i,H_i,R_i) \$ where \$ L_i \$ and \$ R_i \$ are left and right coordinates, respectively, of building \$ i \$ and \$ H_i \$ is the height of the building.
The input is a sequence of building triples. All coordinates of buildings are integers less than 10,000 and there will be at least one and at most 5,000 buildings in the input file. Each building triple is on a line by itself in the input file. All integers in a triple are separated by one or more spaces. The triples will be sorted by \$ L_i \$, the left x-coordinate of the building, so the building with the smallest left x-coordinate is first in the input file.
The output should consist of the skyline vector \$ (v_1, v_2, v_3, \ldots, v_{n−2}, v_{n−1}, v_n)\$, the \$ v_i \$ such that \$ i \$ is an even number represent a horizontal line (height). The \$ v_i \$ such that \$ i \$ is an odd number represent a vertical line (x-coordinate). The skyline vector should represent the “path” taken, for example, by a bug starting at the minimum x-coordinate and traveling horizontally and vertically over all the lines that define the skyline. Thus the last entry in all skyline vectors will be a 0.
This is my solution:
```
# SortedDict
from sortedcontainers import SortedDict # pip3 install sortedcontainers
from collections import deque
import heapq
def skyline(buildings_by_L): # a deque.
buildings_by_H = SortedDict() # a self-balancing binary tree.
buildings_by_R = [] # a priority queue.
skyline = []
def add_point(x):
h = buildings_by_H.iloc[-1] if buildings_by_H else 0
if not skyline:
skyline.append((x, h))
else:
x0, h0 = skyline[-1]
Solution
- Bug
If the city contains two identical buildings then
skyline raises an exception:>>> skyline(deque([(1, 2, 3), (1, 2, 3)]))
Traceback (most recent call last):
File "", line 1, in
File "cr66776.py", line 97, in skyline
delete(heapq.heappop(buildings_by_R)[1])
File "cr66776.py", line 87, in delete
buildings_by_H[b[1]].remove(b)
KeyError: 2This is because the code represents buildings by tuples of numbers, and that means that two identical buildings get stored just once in a set, so when you come to remove the second building, it's not there. To fix this, buildings need to be represented such that each building is unique. One way to do this would be to use a
Building class, as objects are always unique unless they implement an __eq__ method.- Review
-
It's not clear to the reader how the data is represented. The variable
b represents a building, but what are b[0], b[1] and b[2]? These are the building's left x-coordinate, height, and right x-coordinate respectively, so it would make the code a lot easier to read if these were b.left, b.height, and b.right. It's easy to do this using a class, or collections.namedtuple.-
It's not clear exactly what the helper functions
add_point, insert, delete do. In particular, add_point doesn't always add a point to the skyline list: sometimes it updates the last point in the list. The insert function adds a building to the buildings_by_R and buildings_by_H collections, but also calls add_point. These functions could do with comments explaining what exactly it is that they do.-
Some of the operations could do with a comment to explain their purpose. For example here:
h = buildings_by_H.iloc[-1] if buildings_by_H else 0It would be worth noting that this finds the height of the highest building currently standing.
-
The
insert and delete helper functions are only called once, so there's no particular advantage to making them into functions. The code might be clearer if these functions were inlined at the point of use.-
The
insert function contains the following lines of code that manipulate the vaguely named sets parameter:def insert(b, sets=[set()]):
s = buildings_by_H.setdefault(b[1], sets[0])
s.add(b)
if s == sets[0]:
sets[0] = set()It looks as if there was a concern that if the code were written in the obvious way:
def insert(b):
buildings_by_H.setdefault(b[1], set()).add(b)then a
set() object would be created and immediately discarded each time a building was inserted at a position for which buildings_by_H already had an entry. This is a valid concern, but the following would seem to be a clearer way to handle it:def insert(b):
try:
s = buildings_by_H[b[1]]
except KeyError:
buildings_by_H[b[1]] = s = set()
s.add(b)-
It's worth making a little effort to avoid using libraries that are not built into Python. Having to install
sortedcontainers is a small but genuine barrier to running code (and might partly account for the time this question has languished without being reviewed). In this case, SortedDict is used to maintain the collection of currently standing buildings, such that (i) the height of the highest building can be found via buildings_by_H.iloc[-1]; and (ii) the set of buildings at a given height can by located via buildings_by_H[b[1]] and so a building can be removed.However, we can implement (i) by storing the buildings in a heap, because as it says in the
heapq documentation:The interesting property of a heap is that its smallest element is always the root,
heap[0].Python heaps are min-heaps but we want the highest building, so we have to reverse the comparison order on buildings. If buildings are represented by objects, as recommended in §1 above, then this can be done by implementing a
__lt__ method on the Building class.How do we implement (ii)? You can't easily delete elements from a heap. The insight here is that we don't have to delete the building from the heap, we can leave it in the heap but mark it as finished (if buildings are represented by objects, as recommended in §1, above then this can be done by setting an attribute). We only care about the highest standing building, which is at the root of the heap, so we can just check that one to see if it is finished and pop it if so.
-
There's a lot of complexity in the
add_point function. This complexity is due to the fact that when add_point(x) is called, we don't know whether or not we've processed all the buildings that start or end at x, so when we construct a point (x, h) we don't know whether we're done with it, or whether we might have to revise it later.It would simplify this code considerably if we could be sure that all the buildings that start or end at
x have been processed. Then we would know that the point would not need to be revised, and we could just `yieCode Snippets
>>> skyline(deque([(1, 2, 3), (1, 2, 3)]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "cr66776.py", line 97, in skyline
delete(heapq.heappop(buildings_by_R)[1])
File "cr66776.py", line 87, in delete
buildings_by_H[b[1]].remove(b)
KeyError: 2h = buildings_by_H.iloc[-1] if buildings_by_H else 0def insert(b, sets=[set()]):
s = buildings_by_H.setdefault(b[1], sets[0])
s.add(b)
if s == sets[0]:
sets[0] = set()def insert(b):
buildings_by_H.setdefault(b[1], set()).add(b)def insert(b):
try:
s = buildings_by_H[b[1]]
except KeyError:
buildings_by_H[b[1]] = s = set()
s.add(b)Context
StackExchange Code Review Q#66776, answer score: 6
Revisions (0)
No revisions yet.