Friday, November 6, 2009

A fast SAT solver

A decade ago, I developed a system for propositional logic, based on Prime Normal Forms. The main function takes an arbitrary propositional formula φ and returns its prime conjunctive normal form pcnf(φ). Implicitly, this algorithm solves the SAT problem, i.e. it provides a general decision method for the question, if a given formula φ is satisfiable or not, namely:
φ is satisfiable iff pcnf(φ) is not (the normal form of) 0, i.e. "false".

Obviously, the SAT problem is one of the hot issues in computer science and there is a demand for a fast algorithm. However, it has never been the focus of my own research project, which is rather dealing with a re-interpretation of modern logic. I just needed a system that provided me with the functionality of propositional logic and at that time, I didn't know of any available one. I suggested however, that the solution I found would satisfy a general demand and that my approach tackled some very deep insights into the matter. I published the mathematical theory in a paper. I also wrote a Java applet that works like an online pocket calculator for propositional logic and accompanied it with a couple of tutorials and introductions for all kinds of users.

Sketch of the method

In my publications I rather use the dual as the default, i.e. I consider Prime Disjunctive Normal Forms, the function is pdnf instead of pcnf, and the satisfiability problem becomes the validity problem. The algorithm for the pdnf function is not stochastic or heuristic in nature, it is a strictly deterministic and algebraic procedure. I'll try to sketch its basic features, but let me recall some (more or less) standard terminology and well-known facts, first:

  • A literal λ is either an atomic or a negated atomic formula, i.e. α or ¬α.

  • A normal literal conjunction or NLC γ is a conjunction of literals [λ1 ∧ ... ∧ λk] so that the atoms α1, ..., αk occuring in these literals are strict linearly ordered, according to some given linear order relation < on the chosen set of atoms. Each λi is a component of γ.

  • A disjunctive normal form or DNF Δ is a disjunction of NLC's [γ1∨...∨γn]. Each γi is a component of Δ. We all know, that each formula φ has an equivalent DNF Δ, written φ⇔Δ.

  • Given a NLC γ=[λ1 ∧ ... ∧ λk] and a DNF Δ=[γ1∨...∨γn]. We say that

    • γ is a factor of Δ, if γ implies (or is subvalent to) Δ, written γ⇒Δ.

    • γ is a prime factor of Δ, if it is a factor and none of its components λ1, ..., λk could be deleted without violating the subvalence γ⇒Δ.

  • A DNF Δ=[γ1∨...∨γn] is called a

    • prime DNF or PDNF, if the set of its components {γ1, ..., γn} is exactly the set of all its prime factors.

    • minimal DNF or MDNF, if there is no other equivalent DNF which is smaller in size. (The size of a DNF is the number of components and atom occurrences.)

  • Every propositional formula φ has an equivalent PDNF. This PDNF is unique (up to the order of its components). So the function pdnf that returns the equivalent PDNF pdnf(φ) for every given φ is a well-defined canonization of propositional logic.

  • Every φ also has an equivalent MDNF. But this MDNF is not unique in general. Is is however always a subset of the PDNF in the sense that each component of the MDNF must be a component of the PDNF.

Our goal is an implementation of the pdfn function, i.e. the construction of an equivalent PNDF Δ for each given formula φ. The real core of this function is the P-Procedure, which takes an arbitrary DNF Δ and returns the equivalent PDNF P-Procedure(Δ). A classical method to implement the P-Procedure is the Quine-McCluskey method. But that algorithm grows exponentially and is not feasible for other than small input DNF's. We need something else and we start with the idea of pairwise component minimalization and call this the M-Procedure:

  1. We take two components γL and γR of the given Δ and replace it by the components of, i.e. the MDNF [μ1∨...∨μm] of [γL∨γR]. Obviously, m is either 1 or 2, so this step can only decrease the size of Δ.

  2. We repeat the first step until no more changes can be applied.

The resulting DNF, denoted by M-Procedure(Δ), is what we call a pairwise minimal DNF or M2DNF, i.e. a DNF where each pair of components make a minimal DNF. It is easy to proof that

  • each PDNF is a M2DNF, and

  • each MDNF is a M2DNF.

But none of these two facts holds the other way round. M-Procedure(Δ) is neither the prime nor a minimal form of Δ, at least not in general. The M-Procedure is not a realization of the P-Procedure (hence the two different names). But it will serve us well in a proper implementation of the P-Procedure...

I suppose, that most people who spent some time and concentration on the SAT problem have tried this approach of an M-Procedure. It is not a trivial matter to understand why this has to fail. The notion of prime in propositional logic is probably motivated by the according concept in number theory. But a closer investigation of things reveals a surprising and fundamental difference between prime factors in propositional formulas and integers. This problem, but also its solution, stems from the analysis of binary DNF's [γL∨γR].

For every two NLC's γL and γR we write

  • minLR) for the MDNF of [γL∨γR], and

  • primLR) for the PDNF of [γL∨γR]

These functions min and prim have straight-forward implementations (of linear complexity) and they are not hard to explain. What is actually an interesting and crucial point here is the fact that

  • minLR) is made of either one or two components, as mentioned earlier,

  • primLR) is often the same as minLR), but there is also a situation where minLR)=[γL∨γR] and primLR)=[γL∨γR∨γc] is a 3-component DNF. For example, consider

    prim([AB], [¬BC]) = [[AB] ∨ [¬BC] ∨ [AC]]

    This third and new γc is what we call the c-prime.

Now we are able to implement the P-Procedure:

Algorithm P-Procedure(Δ)
Δ' := M-Procedure(Δ) ;
(1.) Δ'' := Δ' ;
(2.) let Π be the set of all c-primes of component pairs
in Δ' ;
(3.) attach all the components of Π to Δ' ;
(4.) Δ' := M-Procedure(Δ') ;
until Δ' and Δ'' contain the same set of components ;
return Δ' ;

The proof for the correctness of this P-Procedure is based on a deep result of what I called Completeness Theorem, saying that a DNF is a PDNF iff it is a c-complete M2DNF.

For its computational complexity holds: If n is the number of different atoms in Δ, then the P-Procedure needs no more than n repeat loops. This, together with the fact that the M-Procedure is of polynomial complexity, let me suggest that the P-Procedure is of polynomial complexity as well. And that, of course, would have been a suprising answer to the open P=NP problem. When I realized that, I spent some time to find evidence for or against my conjecture, but I was only able to deliver some lemmata and partial proofs, but no definite decision.


All mentioned material is available on In particular: