principlepythonModerate

N-Queens, bitwise approach

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

bitwiseapproachqueens

Problem

I've recently solved the famous N-Queens problem:

Can you place \$N\$ chess queens on a \$N × N\$ board in such a way that no one
attacks another queen?

The idea here is based on the ideas from several blogposts describing a bitwise approach - approaching the board row by row, keeping track of bits of 3 numbers - columns, major and minor diagonals:

class Solution(object):
    def totalNQueens(self, n):
        all_ones = 2 ** n - 1

        def helper(ld, column, rd, count=0):
            if column == all_ones:  # filled out all vacant positions
                return 1

            possible_slots = ~(ld | column | rd)  # mark possible vacant slots as 1s
            while possible_slots & all_ones:
                current_bit = possible_slots & -possible_slots  # get first 1 from the right
                possible_slots -= current_bit  # occupy a slot

                # mark conflicts in the next row
                next_column_bit = column | current_bit
                next_rd_bit = (rd | current_bit) > 1

                count += helper(next_ld_bit, next_column_bit, next_rd_bit)
            return count

        return helper(0, 0, 0)

The solution was accepted by the LeetCode OJ and actually beats 98.97% of Python solutions. But, can we optimize it even further? What would you improve code-quality/readability wise?

Another follow-up concern is that I don't see an easy way for this problem to instead of counting possible board arrangements, collect all possible distinct board configurations.

Solution

Micro-optimizing Python is a bit of a quixotic endeavour — if you wanted the fastest possible performance then you wouldn't be using Python. But so long as we understand the nature of the exercise, then it's just a matter of trying lots of changes and measuring the results to see which ones are improvements:

Here's the baseline measurement for comparison with the improvements:

>>> from timeit import timeit
>>> timeit(lambda:totalNQueens(13), number=1)
4.241301966001629

-
Move the mask against all_ones out of the loop. That is, instead of:

possible_slots = ~(ld | column | rd)
while possible_slots & all_ones:

write:

possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:

This saves about 6%:

>>> timeit(lambda:totalNQueens(13), number=1)
3.999626944991178

-
Accumulate count via a variable local to totalNQueens instead of returning it from helper:

def totalNQueens(n):
    all_ones = 2 ** n - 1
    count = 0

    def helper(ld, column, rd):
        nonlocal count
        if column == all_ones:
            count += 1
            return
        possible_slots = ~(ld | column | rd) & all_ones
        while possible_slots:
            current_bit = possible_slots & -possible_slots
            possible_slots -= current_bit
            next_column_bit = column | current_bit
            next_rd_bit = (rd | current_bit) > 1
            helper(next_ld_bit, next_column_bit, next_rd_bit)

    helper(0, 0, 0)
    return count

This saves another 3% or so:

>>> timeit(lambda:totalNQueens(13), number=1)
3.8595934879995184

-
Avoid the local variables next_column_bit and so on:

def totalNQueens(n):
    all_ones = 2 ** n - 1
    count = 0

    def helper(ld, column, rd):
        nonlocal count
        if column == all_ones:
            count += 1
            return
        possible_slots = ~(ld | column | rd) & all_ones
        while possible_slots:
            current_bit = possible_slots & -possible_slots
            possible_slots -= current_bit
            helper((ld | current_bit) >> 1,
                   column | current_bit,
                   (rd | current_bit) << 1)

    helper(0, 0, 0)
    return count

This saves roughly another 4%:

>>> timeit(lambda:totalNQueens(13), number=1)
3.706758002997958

That's about 13% improvement in all.

Code Snippets

>>> from timeit import timeit
>>> timeit(lambda:totalNQueens(13), number=1)
4.241301966001629

possible_slots = ~(ld | column | rd)
while possible_slots & all_ones:

possible_slots = ~(ld | column | rd) & all_ones
while possible_slots:

>>> timeit(lambda:totalNQueens(13), number=1)
3.999626944991178

def totalNQueens(n):
    all_ones = 2 ** n - 1
    count = 0

    def helper(ld, column, rd):
        nonlocal count
        if column == all_ones:
            count += 1
            return
        possible_slots = ~(ld | column | rd) & all_ones
        while possible_slots:
            current_bit = possible_slots & -possible_slots
            possible_slots -= current_bit
            next_column_bit = column | current_bit
            next_rd_bit = (rd | current_bit) << 1
            next_ld_bit = (ld | current_bit) >> 1
            helper(next_ld_bit, next_column_bit, next_rd_bit)

    helper(0, 0, 0)
    return count

Context

StackExchange Code Review Q#159946, answer score: 10

Revisions (0)

No revisions yet.