principlepythonModerate
N-Queens, bitwise approach
Viewed 0 times
bitwiseapproachqueens
Problem
I've recently solved the famous N-Queens problem:
Can you place \$N\$ chess queens on a \$N × N\$ board in such a way that no one
attacks another queen?
The idea here is based on the ideas from several blogposts describing a bitwise approach - approaching the board row by row, keeping track of bits of 3 numbers - columns, major and minor diagonals:
The solution was accepted by the LeetCode OJ and actually beats 98.97% of Python solutions. But, can we optimize it even further? What would you improve code-quality/readability wise?
Another follow-up concern is that I don't see an easy way for this problem to instead of counting possible board arrangements, collect all possible distinct board configurations.
Can you place \$N\$ chess queens on a \$N × N\$ board in such a way that no one
attacks another queen?
The idea here is based on the ideas from several blogposts describing a bitwise approach - approaching the board row by row, keeping track of bits of 3 numbers - columns, major and minor diagonals:
class Solution(object):
def totalNQueens(self, n):
all_ones = 2 ** n - 1
def helper(ld, column, rd, count=0):
if column == all_ones: # filled out all vacant positions
return 1
possible_slots = ~(ld | column | rd) # mark possible vacant slots as 1s
while possible_slots & all_ones:
current_bit = possible_slots & -possible_slots # get first 1 from the right
possible_slots -= current_bit # occupy a slot
# mark conflicts in the next row
next_column_bit = column | current_bit
next_rd_bit = (rd | current_bit) > 1
count += helper(next_ld_bit, next_column_bit, next_rd_bit)
return count
return helper(0, 0, 0)The solution was accepted by the LeetCode OJ and actually beats 98.97% of Python solutions. But, can we optimize it even further? What would you improve code-quality/readability wise?
Another follow-up concern is that I don't see an easy way for this problem to instead of counting possible board arrangements, collect all possible distinct board configurations.
Solution
Micro-optimizing Python is a bit of a quixotic endeavour — if you wanted the fastest possible performance then you wouldn't be using Python. But so long as we understand the nature of the exercise, then it's just a matter of trying lots of changes and measuring the results to see which ones are improvements:
Here's the baseline measurement for comparison with the improvements:
-
Move the mask against
write:
This saves about 6%:
-
Accumulate
This saves another 3% or so:
-
Avoid the local variables
This saves roughly another 4%:
That's about 13% improvement in all.
Here's the baseline measurement for comparison with the improvements:
>>> from timeit import timeit
>>> timeit(lambda:totalNQueens(13), number=1)
4.241301966001629-
Move the mask against
all_ones out of the loop. That is, instead of:possible_slots = ~(ld | column | rd)
while possible_slots & all_ones:write:
possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:This saves about 6%:
>>> timeit(lambda:totalNQueens(13), number=1)
3.999626944991178-
Accumulate
count via a variable local to totalNQueens instead of returning it from helper:def totalNQueens(n):
all_ones = 2 ** n - 1
count = 0
def helper(ld, column, rd):
nonlocal count
if column == all_ones:
count += 1
return
possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:
current_bit = possible_slots & -possible_slots
possible_slots -= current_bit
next_column_bit = column | current_bit
next_rd_bit = (rd | current_bit) > 1
helper(next_ld_bit, next_column_bit, next_rd_bit)
helper(0, 0, 0)
return countThis saves another 3% or so:
>>> timeit(lambda:totalNQueens(13), number=1)
3.8595934879995184-
Avoid the local variables
next_column_bit and so on:def totalNQueens(n):
all_ones = 2 ** n - 1
count = 0
def helper(ld, column, rd):
nonlocal count
if column == all_ones:
count += 1
return
possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:
current_bit = possible_slots & -possible_slots
possible_slots -= current_bit
helper((ld | current_bit) >> 1,
column | current_bit,
(rd | current_bit) << 1)
helper(0, 0, 0)
return countThis saves roughly another 4%:
>>> timeit(lambda:totalNQueens(13), number=1)
3.706758002997958That's about 13% improvement in all.
Code Snippets
>>> from timeit import timeit
>>> timeit(lambda:totalNQueens(13), number=1)
4.241301966001629possible_slots = ~(ld | column | rd)
while possible_slots & all_ones:possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:>>> timeit(lambda:totalNQueens(13), number=1)
3.999626944991178def totalNQueens(n):
all_ones = 2 ** n - 1
count = 0
def helper(ld, column, rd):
nonlocal count
if column == all_ones:
count += 1
return
possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:
current_bit = possible_slots & -possible_slots
possible_slots -= current_bit
next_column_bit = column | current_bit
next_rd_bit = (rd | current_bit) << 1
next_ld_bit = (ld | current_bit) >> 1
helper(next_ld_bit, next_column_bit, next_rd_bit)
helper(0, 0, 0)
return countContext
StackExchange Code Review Q#159946, answer score: 10
Revisions (0)
No revisions yet.