snippetMinor
What are efficient approaches to implement unit propagation in DPLL-based SAT solvers?
Viewed 0 times
satimplementwhataresolversefficientbasedapproachespropagationunit
Problem
I'm trying to decompose deduction steps of DPLL algorithm -- unit propagation and pure literal elimination -- for parallelization. However, I want a baseline and asymptotic analysis to compare to my parallel approach for these two deduction procedures.
My naive approach to unit propagation would be to perform linear scan on all clauses, and that of pure literal elimination is again to linear scan and count the number of + and - polarities. But, it wouldn't make sense to compare my algorithm to the most naive approach available.
After reading this paper -- "The Quest for Efficient Boolean Satisfiability Solvers" -- I came to know of two more approaches:
But I believe there can be better data structures to represent the input formula and thus better algorithms and complexity analyses.
I'm only concerned with the approaches to deduce assignments at the beginning of DPLL procedure pseudocode as given on Wikipedia's page, and not branching heuristics or CDCL.
Are there any resources and comparisons of approaches to implement unit propagation (and possibly PLE) in DPLL? Or is two-watched literals still the most popular/ efficient technique till date?
My naive approach to unit propagation would be to perform linear scan on all clauses, and that of pure literal elimination is again to linear scan and count the number of + and - polarities. But, it wouldn't make sense to compare my algorithm to the most naive approach available.
After reading this paper -- "The Quest for Efficient Boolean Satisfiability Solvers" -- I came to know of two more approaches:
- Head-tail lists
- Two-watched literals (which disables pure literal search)
But I believe there can be better data structures to represent the input formula and thus better algorithms and complexity analyses.
I'm only concerned with the approaches to deduce assignments at the beginning of DPLL procedure pseudocode as given on Wikipedia's page, and not branching heuristics or CDCL.
Are there any resources and comparisons of approaches to implement unit propagation (and possibly PLE) in DPLL? Or is two-watched literals still the most popular/ efficient technique till date?
Solution
Theory
Unit Propagation
For unit propagation, two-watched literals with circular updates is asymptotically optimal. I suggest reading "Optimal Implementation of Watched Literals and More General Techniques" which generalizes the concept and proves optimality. (It can also be helpful to read "An Efficient Algorithm for Unit Propagation" to see why head/tail meets the same optimality, though with larger storage costs and backtracking costs.)
When a watcher is falsified, it must find some unassigned literal in the clause to replace it. (If it cannot, then in a two-watcher scheme the remaining watcher has become unit; in a one-watcher scheme the clause has become empty.) Critically, when doing this search for a replacement literal one must begin where the last search ended, treating the clause as a circular buffer.
When done this way, updating watchers takes constant time when amortized across the search tree. Therefore unit resolution takes amortized linear time. (Older solvers did not do this circular tracking, and instead always began the search from the start of the clause; this leads to quadratic worst-case complexity instead.)
When backtracking, watched literals require no updates.
Pure Literal Elimination
For pure literal elimination, after unit propagation we need to determine all unassigned literals with zero instances of a polarity. Polarity count can only change when a clause transitions between satisfied and unsatisfied, so when a literal is assigned/unassigned we must necessarily scan over all the clauses it occurs in (not just watches, but occurs) and update polarity counts. Care must be taken not to double-count clauses, but updating literals in a clause is already linear so this doesn't affect asymptotic worst case.
When a literal has zero instances of a polarity, it may be assigned. No unit propagation is necessary, since a pure literal has no clauses to resolve with.
When backtracking, polarity counts do require updates.
Conditional Autarkies
As an aside, the kind of pure literals DPLL seeks to find are actually a special case of a more general concept called conditional autarkies.
An autarky
Given a conditional autarky for
Pure literals are just a special case of autarky, satisfying every clause they touch. If the literal is only pure after assigning some variables then that is a conditional pure literal, a special case of a conditional autarky.
Practice
Unit Propagation
Modern solvers spend a bulk of their time doing unit propagation, so this is a highly-optimized section of code. Although the worst case for unit resolution is amortized linear, in practice the number of clauses updated is less than this. With watched literals you might even hit a best case of not being a watcher for any clauses! That's one of the reasons why this is so efficient in practice.
To optimize storage, a clause will store the two watchers as the first two literals in its elements list. This partitions the clause into the watcher half (size 2) and the remainder half. This allows the search for a new watcher to never consider the other watcher as viable.
Memory is slow, so cache misses are a significant cause of latency in watchlist updates. Iterating a watchlist is fast, since it is contiguous and the prefetcher does its thing well. But to actually iterate the literals of a clause (as well as retrieve the clause search position for circular updates) requires an indirection which is almost certainly going to be a cache miss.
To remedy this, each watch in a watchlist stores a cached literal from the clause called a 'blocking literal'. If this literal is satisfied, we don't need to do anything and move on. This check is likely to hit cache and be ~free. Otherwise we go ahead and perform the indirection to find a new watcher. (And the blocking literal will be updated to a more likely useful literal as a side effect.)
The next optimization is to note that for binary clauses, the watcher along with the blocking literal is enough space for the entire clause. No indirection is required at all and no clause ID is needed, provided we know this is a binary clause. To do that, we declare that our solver doesn't support more than 2^31 literals. This lets us store literals in a u32 with the MSB free for other uses; in our case we will set this MSB if the clause is binary.
That is, if the MSB is set then this value (with the MSB cleared) is the other literal of a binary clause, else this value is a blocking literal for a long clause and the clause ID is the next u32 after it. The result is that we get a watchlist containing variable-len
Unit Propagation
For unit propagation, two-watched literals with circular updates is asymptotically optimal. I suggest reading "Optimal Implementation of Watched Literals and More General Techniques" which generalizes the concept and proves optimality. (It can also be helpful to read "An Efficient Algorithm for Unit Propagation" to see why head/tail meets the same optimality, though with larger storage costs and backtracking costs.)
When a watcher is falsified, it must find some unassigned literal in the clause to replace it. (If it cannot, then in a two-watcher scheme the remaining watcher has become unit; in a one-watcher scheme the clause has become empty.) Critically, when doing this search for a replacement literal one must begin where the last search ended, treating the clause as a circular buffer.
When done this way, updating watchers takes constant time when amortized across the search tree. Therefore unit resolution takes amortized linear time. (Older solvers did not do this circular tracking, and instead always began the search from the start of the clause; this leads to quadratic worst-case complexity instead.)
When backtracking, watched literals require no updates.
Pure Literal Elimination
For pure literal elimination, after unit propagation we need to determine all unassigned literals with zero instances of a polarity. Polarity count can only change when a clause transitions between satisfied and unsatisfied, so when a literal is assigned/unassigned we must necessarily scan over all the clauses it occurs in (not just watches, but occurs) and update polarity counts. Care must be taken not to double-count clauses, but updating literals in a clause is already linear so this doesn't affect asymptotic worst case.
When a literal has zero instances of a polarity, it may be assigned. No unit propagation is necessary, since a pure literal has no clauses to resolve with.
When backtracking, polarity counts do require updates.
Conditional Autarkies
As an aside, the kind of pure literals DPLL seeks to find are actually a special case of a more general concept called conditional autarkies.
An autarky
α is a (partial) assignment that satisfies all clauses it touches. A conditional autarky is an autarky with an assumption, i.e. if α_aut is an autarky for F ∧ α_con, then α_con ∪ α_aut is a conditional autarky.Given a conditional autarky for
F, the formula F ^ (α_con → α_aut) is not implied but is equisatisfiable. Note that the conditional part may be empty, so vanilla autarkies may also be assumed.Pure literals are just a special case of autarky, satisfying every clause they touch. If the literal is only pure after assigning some variables then that is a conditional pure literal, a special case of a conditional autarky.
Practice
Unit Propagation
Modern solvers spend a bulk of their time doing unit propagation, so this is a highly-optimized section of code. Although the worst case for unit resolution is amortized linear, in practice the number of clauses updated is less than this. With watched literals you might even hit a best case of not being a watcher for any clauses! That's one of the reasons why this is so efficient in practice.
To optimize storage, a clause will store the two watchers as the first two literals in its elements list. This partitions the clause into the watcher half (size 2) and the remainder half. This allows the search for a new watcher to never consider the other watcher as viable.
Memory is slow, so cache misses are a significant cause of latency in watchlist updates. Iterating a watchlist is fast, since it is contiguous and the prefetcher does its thing well. But to actually iterate the literals of a clause (as well as retrieve the clause search position for circular updates) requires an indirection which is almost certainly going to be a cache miss.
To remedy this, each watch in a watchlist stores a cached literal from the clause called a 'blocking literal'. If this literal is satisfied, we don't need to do anything and move on. This check is likely to hit cache and be ~free. Otherwise we go ahead and perform the indirection to find a new watcher. (And the blocking literal will be updated to a more likely useful literal as a side effect.)
The next optimization is to note that for binary clauses, the watcher along with the blocking literal is enough space for the entire clause. No indirection is required at all and no clause ID is needed, provided we know this is a binary clause. To do that, we declare that our solver doesn't support more than 2^31 literals. This lets us store literals in a u32 with the MSB free for other uses; in our case we will set this MSB if the clause is binary.
That is, if the MSB is set then this value (with the MSB cleared) is the other literal of a binary clause, else this value is a blocking literal for a long clause and the clause ID is the next u32 after it. The result is that we get a watchlist containing variable-len
Code Snippets
[Other][Blocking][Clause ID][Blocking][Clause ID][Other][Other]etc.Context
StackExchange Computer Science Q#150557, answer score: 3
Revisions (0)
No revisions yet.