Circumspect descent prevails in solving random constraint satisfaction problems
 *Department of Engineering Physics, Helsinki University of Technology, P.O. Box 1100, FI-02015, Espoo, Finland;
 ^{†}Swedish Institute of Computer Science AB, SE-164 29 Kista, Sweden;
 ^{‡}Department of Computational Biology, AlbaNova University Centre, SE-106 91 Stockholm, Sweden;
 ^{§}Linnaeus Centre, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden;
 ^{‖}Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, P.O. Box 68, FI-00014, Helsinki, Finland;
 **School of Information and Communication Technology, KTH Royal Institute of Technology, SE-164 40 Kista, Sweden; and
 ^{††}Department of Information and Computer Science, Helsinki University of Technology, P.O. Box 5400, FI-02015, Espoo, Finland

Edited by Giorgio Parisi, University of Rome, Rome, Italy, and approved August 22, 2008 (received for review January 16, 2008)
Abstract
We study the performance of stochastic local search algorithms for random instances of the K-satisfiability (K-SAT) problem. We present a stochastic local search algorithm, ChainSAT, which moves in the energy landscape of a problem instance by never going upward in energy. ChainSAT is a focused algorithm in the sense that it focuses on variables occurring in unsatisfied clauses. We show by extensive numerical investigations that ChainSAT and other focused algorithms solve large K-SAT instances almost surely in linear time, up to high clause-to-variable ratios α; for example, for K = 4 we observe linear-time performance well beyond the recently postulated clustering and condensation transitions in the solution space. The performance of ChainSAT is a surprise given that, by design, the algorithm gets trapped in the first local energy minimum it encounters; empirically, no such minima are encountered. We also study the geometry of the solution space as accessed by stochastic local search algorithms.
Constraint satisfaction problems (CSPs) are the industrial, commercial, and often very large-scale analogues of popular leisure-time pursuits such as the Sudoku puzzle. They can be formulated abstractly in terms of N variables x_{1}, x_{2},…, x_{N} and M constraints, where each variable x_{i} takes a value in a finite set, and each constraint forbids certain combinations of values of the variables. The classical example of a worst-case intractable (1) constraint satisfaction problem is the K-satisfiability (K-SAT) problem (2), where each variable takes a Boolean value (either 0 or 1), and each constraint is a clause over K variables disallowing one out of the 2^{K} possible combinations of values. An instance of K-SAT can also be interpreted directly as a spin system of statistical physics. Each constraint corresponds to a K-spin interaction in a Hamiltonian, and the spins represent the original variables; ground states of the Hamiltonian at zero energy correspond to the solutions, that is, assignments of values to the variables that satisfy all of the clauses (3).
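The spin-system correspondence is easy to make concrete: the energy of a configuration is its number of unsatisfied clauses, and solutions are exactly the zero-energy configurations. A minimal Python sketch (the DIMACS-style signed-literal encoding is our own illustrative choice, not something specified in the paper):

```python
def energy(clauses, assignment):
    """Number of clauses not satisfied by the assignment (the 'energy').

    Clauses are lists of DIMACS-style signed literals: literal +i is true
    when assignment[i] == 1, and literal -i is true when assignment[i] == 0.
    """
    def satisfied(clause):
        return any((assignment[abs(lit)] == 1) == (lit > 0) for lit in clause)
    return sum(1 for c in clauses if not satisfied(c))

# A tiny 3-SAT instance over variables 1..3; each clause rules out one
# of the 2^3 combinations of values for its three variables.
clauses = [[1, 2, 3], [-1, 2, -3], [1, -2, 3]]
print(energy(clauses, {1: 1, 2: 0, 3: 0}))  # a ground state: prints 0
```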
It was first observed in the context of K-SAT, and then in the context of several other CSPs (4), that ensembles of random CSPs have a "phase transition," a sharp change in the likelihood to be solvable (5). Empirically, algorithms have been observed to fail or have difficulties in the immediate neighborhood of such phase-transition points, a fact that has given rise to a large literature (4). Large unstructured CSPs are solved either by general-purpose deterministic methods, of which the archetypal example is the Davis–Putnam–Logemann–Loveland (DPLL) algorithm (6), or by more tailored algorithms, such as the Survey Propagation (SP) algorithm (7) motivated by spin-glass theory, or variants of stochastic local search techniques (8–10).
Stochastic local search (SLS) methods are competitive on some of the largest and least-structured problems of interest (11), in particular on random K-SAT instances, which are constructed by selecting independently and uniformly at random M clauses over the N variables; the parameter controlling the satisfiability of an instance is α = M/N, the ratio of clauses to variables. SLS algorithms work by making successive random changes to a trial configuration (assignment of values to the variables) based on information about a local neighborhood in the set of all possible configurations. Their modern history starts with the celebrated simulated annealing algorithm of Kirkpatrick, Gelatt, and Vecchi (12). From the perspective of K-SAT, the next fundamental step forward was an algorithm of Papadimitriou (13), now often called RandomWalkSAT, which introduced the notion of focusing the random moves to rectify broken constraints. RandomWalkSAT has been shown, by simulation and theoretical arguments, to solve the paradigmatic case of random 3-satisfiability up to about α = 2.7 clauses per variable, almost surely in time linear in N (14, 15). A subsequent influential development was Selman, Kautz, and Cohen's WalkSAT algorithm (16), which mixes focused random and greedy moves for better performance. We have previously shown that WalkSAT and several other stochastic local search heuristics work almost surely in linear time up to at least α = 4.21 clauses per variable (17–19). In comparison, the satisfiability/unsatisfiability threshold of random 3-satisfiability is believed to be at α = 4.267 clauses per variable (20).
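The focused move introduced by RandomWalkSAT can be sketched in a few lines. The following is our own illustrative Python rendering of Papadimitriou's walk, not the tuned implementations used in the cited studies; the signed-literal clause encoding is an assumption of this sketch:

```python
import random

def random_walk_sat(clauses, n_vars, max_flips=100_000, seed=0):
    """Focused random walk: while some clause is broken, pick a broken
    clause uniformly at random and flip a random variable in it."""
    rng = random.Random(seed)
    S = {v: rng.randint(0, 1) for v in range(1, n_vars + 1)}
    def satisfied(c):
        return any((S[abs(lit)] == 1) == (lit > 0) for lit in c)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return S                            # all clauses satisfied
        V = abs(rng.choice(rng.choice(unsat)))  # focusing: a variable of a broken clause
        S[V] = 1 - S[V]
    return None                                 # give up after max_flips
```

Rescanning all clauses each step costs O(M); practical implementations instead maintain the unsatisfied-clause list incrementally, which is what makes per-flip cost constant.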
The present work carries out a systematic empirical study of random K-SAT for K = 4, and we also present extensive data for K = 5 and K = 6. Our motivation for this study is threefold.
Testing the Limits of Local Search.
It has been empirically observed for K = 3 that many SLS algorithms have a linear-time regime, which extends to the immediate vicinity of the phase-transition point (17–19). Thus, a similar investigation for higher K is warranted.
Structure of the Space of Solutions.
Recent rigorous results and nonrigorous predictions from spin-glass theory suggest that the structure of the space of solutions of a random K-SAT instance undergoes various qualitative changes for K ≥ 4, whose implications for the performance of algorithms should be investigated.
Mézard, Mora, and Zecchina (21) have shown rigorously that for K ≥ 8, the space of solutions of random K-SAT breaks into multiple clusters separated by extensive Hamming distance. (The normalized Hamming distance of two Boolean vectors of length N is the number of positions in which the vectors differ, divided by N.) In more precise terms, an instance of K-SAT is x-satisfiable if it has a pair of solutions with normalized Hamming distance x, 0 ≤ x ≤ 1. Mézard, Mora, and Zecchina (21) show that, for K ≥ 8, there exists an interval (a, b), 0 < a < b < 1/2, such that, with high probability as N→∞, a random instance ceases to be x-satisfiable for all x ∈ (a, b) at a smaller value of α than that at which it ceases to be x-satisfiable for some x ∈ [b, 1/2].
For K = 4, we see no evidence of gaps in the empirical x-satisfiability spectrum in the linear-time regime of SLS algorithms, which includes the predicted spin-glass-theoretic clustering points. In light of the rigorous results for K ≥ 8, this suggests that the cases K = 4 and K = 8 may be qualitatively different. Moreover, we observe that recently predicted spin-glass-theoretic clustering thresholds [Krzakala et al. (22)] have no impact on algorithm performance. This raises the question of whether the energy landscape of random K-SAT for small K is in some regard more elementary than has previously been believed.
Structure of the Energy Landscape.
In the context of random K-SAT, it is folklore that SLS algorithms appear to benefit from circumspect descent in energy, that is, from a conservative policy of lowering the number of clauses not satisfied by the trial configuration. To explore this issue further, we introduce an SLS algorithm that we call ChainSAT. It is based on four ideas: (i) focusing, (ii) easing difficult-to-satisfy constraints by so-called chaining moves, (iii) restraining the rate of descent, and (iv) never going upward in energy; that is, the number of unsatisfied clauses is a nonincreasing function of the sequence of trial configurations traversed by the algorithm.
By design, ChainSAT cannot escape from a local minimum in the energy landscape. Yet, empirically, ChainSAT is able to find a solution (for K = 4, 5, 6, the cases studied here), almost surely in linear time, up to values of α reached by SLS algorithms that are allowed to go up in energy, such as the Focused Metropolis Search (19). This observation further supports the position that random K-SAT for small K may be more elementary than has previously been believed.
Results
Experiments with Focused Metropolis Search.
The Focused Metropolis Search (FMS) algorithm (19) is given in pseudocode as follows:

 1: S = random assignment of values to the variables
 2: while S is not a solution do
 3:   C = a clause not satisfied by S selected uniformly at random
 4:   V = a variable in C selected uniformly at random
 5:   ΔE = change in the number of unsatisfied clauses if V is flipped in S
 6:   if ΔE ≤ 0 then
 7:     flip V in S
 8:   else
 9:     with probability η^{ΔE}
10:       flip V in S
11:     end with
12:   end if
13: end while
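A direct, unoptimized Python transcription of this pseudocode may help make it concrete. Here ΔE is computed by a tentative flip over the clauses containing V, and the signed-literal instance encoding is our own assumption, not the authors' implementation:

```python
import random

def focused_metropolis_search(clauses, n_vars, eta=0.293,
                              max_flips=1_000_000, seed=1):
    """FMS sketch: focused moves; downhill and sideways flips are always
    accepted, uphill flips are accepted with probability eta**dE."""
    rng = random.Random(seed)
    S = {v: rng.randint(0, 1) for v in range(1, n_vars + 1)}
    def satisfied(c):
        return any((S[abs(lit)] == 1) == (lit > 0) for lit in c)
    occurs = {v: [] for v in range(1, n_vars + 1)}  # clauses containing each variable
    for c in clauses:
        for lit in c:
            occurs[abs(lit)].append(c)
    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return S
        V = abs(rng.choice(rng.choice(unsat)))      # focusing
        broken_before = sum(1 for c in occurs[V] if not satisfied(c))
        S[V] = 1 - S[V]                             # tentative flip
        dE = sum(1 for c in occurs[V] if not satisfied(c)) - broken_before
        if dE > 0 and rng.random() >= eta ** dE:
            S[V] = 1 - S[V]                         # reject the uphill move
    return None
```

Only the clauses containing V can change status when V is flipped, so restricting the ΔE computation to occurs[V] is exact.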
This section documents our experiments aimed at charting the empirical linear-time region of FMS on random K-SAT for K = 4.
FMS Performance.
In Fig. 1, we present empirical evidence that FMS almost surely runs in time linear in N for instances of random K-satisfiability at K = 4 and α = 9.6. That the curves get steeper with increasing N implies concentration of solution times, that is, above- and below-average solution times become rarer with increasing N. Note that the scaling implies performance almost surely linear in N and demonstrates that the linear-time regime of FMS extends beyond the predicted (22) spin-glass-theoretic "dynamical" and "condensation" transition points.
For K = 3, it has already been established that the FMS algorithm has an "operating window" in terms of the adjustable "temperature" parameter η (19). For too-large values of η, the linearity (in N) is destroyed by fluctuations that keep the algorithm from reaching low energies and the solution. For too-small values of η, the algorithm becomes "too greedy," leading to a divergence of solution times. Thus, to obtain performance linear in N, it is necessary to optimize the parameter η carefully. Such an optimization for K = 4 (the details of which we omit for reasons of space; compare ref. 19) reveals that the operating window for α = 9.60 contains at least the interval 0.292 ≤ η ≤ 0.294. In the experiments, we have chosen η = 0.293.
Experiments on x-Satisfiability Using FMS.
Our experimental setup to investigate x-satisfiability is as follows. For given values of α and N, we first generate a random K-SAT instance and find one reference solution of this instance using FMS. Then, using FMS, we search for other solutions of the same instance. The initial configuration S for FMS is selected uniformly at random from the set of all configurations having a given Hamming distance to the reference solution. When FMS finds a solution, we record the normalized Hamming distance x between the solution found and the reference solution.
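Sampling the initial configuration in this setup amounts to flipping a uniformly random subset of variables of a given size in the reference solution. A sketch of our reading of that step (with the distance given unnormalized, as a number of variables):

```python
import random

def start_at_distance(reference, distance, seed=0):
    """Sample uniformly at random a configuration at the given Hamming
    distance (number of differing variables) from a reference solution."""
    rng = random.Random(seed)
    flipped = set(rng.sample(sorted(reference), distance))  # which variables to flip
    return {v: 1 - b if v in flipped else b for v, b in reference.items()}

ref = {1: 0, 2: 1, 3: 1, 4: 0}
start = start_at_distance(ref, 2)
print(sum(start[v] != ref[v] for v in ref))  # prints 2
```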
Our experiments on random K-SAT for K = 4 did not reveal any gaps in the x-satisfiability spectrum, even for α = 9.6, which is beyond the predicted spin-glass-theoretic "dynamical" and "condensation" transition points (22). In particular, Fig. 2 gives empirical evidence that solutions are found at all distances smaller than the typical distance of solutions found by FMS. This is in contrast to the numerical results of Battaglia et al. (23) for a balanced version of K = 5.
Here, it should be pointed out that the solutions found by stochastic local search need not be typical solutions in the space of all solutions; there can be other solutions that are not reached by FMS or other algorithms. Evidence of this is reflected in the "whiteness" status of solutions (19, 24): all of the solutions found in our experiments were completely white, that is, they had no locally frozen variables (25). One can, of course, imagine that a "typical solution" is not white under the circumstances examined here but, as noted, there is no evidence of the existence of such solutions (compare ref. 26). Fig. 3 summarizes the results of a scaling analysis with increasing N over five random instances and reference solutions. The distance distributions appear to converge to a specific curve without vertical sections; this absence suggests that, in the limit of infinite N, the x-satisfiability spectrum has no gaps below the typical distance of solutions found by FMS.
Fig. 4 summarizes the results of a scaling analysis with increasing α. We see that the typical distance between solutions found by FMS decreases with increasing α, and that no clear gaps are apparent in the distance data.
Experiments with ChainSAT.
A heuristic that never moves up in energy is here shown to solve random K-satisfiability problems, almost surely in time linear in N, for K = 4, 5, 6. Our heuristic, ChainSAT, is given in pseudocode as follows:

 1: S = random assignment of values to the variables
 2: chaining = FALSE
 3: while S is not a solution do
 4:   if not chaining then
 5:     C = a clause not satisfied by S selected uniformly at random
 6:     V = a variable in C selected uniformly at random
 7:   end if
 8:   ΔE = change in the number of unsatisfied clauses if V is flipped in S
 9:   chaining = FALSE
10:   if ΔE = 0 then
11:     flip V in S
12:   else if ΔE < 0 then
13:     with probability p_{1}
14:       flip V in S
15:     end with
16:   else
17:     with probability 1 − p_{2}
18:       C = a clause satisfied only by V selected uniformly at random
19:       V′ = a variable in C other than V selected uniformly at random
20:       V = V′
21:       chaining = TRUE
22:     end with
23:   end if
24: end while
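A direct Python transcription of the pseudocode can make the chaining mechanism concrete. This is an unoptimized sketch with p = p₁ = p₂ (as in our experiments); the signed-literal clause encoding is an assumption of the sketch, not the authors' implementation:

```python
import random

def chainsat(clauses, n_vars, p=0.01, max_steps=1_000_000, seed=2):
    """ChainSAT sketch: never accepts a move with dE > 0; uphill pressure
    is relieved by chaining through critically satisfied clauses."""
    rng = random.Random(seed)
    S = {v: rng.randint(0, 1) for v in range(1, n_vars + 1)}
    def true_lit(lit):
        return (S[abs(lit)] == 1) == (lit > 0)
    occurs = {v: [] for v in range(1, n_vars + 1)}  # clauses containing each variable
    for c in clauses:
        for lit in c:
            occurs[abs(lit)].append(c)
    chaining, V = False, None
    for _ in range(max_steps):
        unsat = [c for c in clauses if not any(true_lit(l) for l in c)]
        if not unsat:
            return S
        if not chaining:
            V = abs(rng.choice(rng.choice(unsat)))  # focusing
        chaining = False
        broken = lambda: sum(1 for c in occurs[V]
                             if not any(true_lit(l) for l in c))
        before = broken()
        S[V] = 1 - S[V]                             # tentative flip to measure dE
        dE = broken() - before
        S[V] = 1 - S[V]                             # undo the tentative flip
        if dE == 0:
            S[V] = 1 - S[V]                         # equienergetic move
        elif dE < 0:
            if rng.random() < p:                    # circumspect descent
                S[V] = 1 - S[V]
        elif rng.random() < 1 - p:                  # dE > 0: chain instead of going up
            # clauses currently satisfied only by V (these would break)
            critical = [c for c in occurs[V]
                        if [abs(l) for l in c if true_lit(l)] == [V]]
            C = rng.choice(critical)                # nonempty, since dE > 0
            V = rng.choice([abs(l) for l in C if abs(l) != V])
            chaining = True
    return None
```

Note that when ΔE > 0 at least one clause satisfied only by V must exist, since flipping V would have to break some clause; those are exactly the clauses the chain moves through.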
The algorithm (i) never increases the energy of the current configuration S and (ii) exercises circumspection in decreasing the energy. In particular, moves that decrease the energy are taken only sporadically compared with equienergetic and chaining moves. The latter are designed to alleviate critically satisfied constraints by proceeding in chains of variable-clause-variable until a variable is found that can be flipped without an increase in energy. Focusing is used for the nonchaining moves. Each chain thus works toward flipping a variable so as to satisfy the originally selected broken constraint.
The ChainSAT algorithm has two adjustable parameters, one (p_{1}) for controlling the rate of descent (by accepting energy-lowering flips) and another (p_{2}) for limiting the length of the chains to avoid looping. In our experiments, we let p = p_{1} = p_{2}.
ChainSAT Performance.
In Fig. 5, we present empirical evidence that ChainSAT almost surely runs in time linear in N for random K-satisfiability problems with K = 4, 5, 6. That the curves get steeper with increasing N implies concentration of solution times, that is, above- and below-average solution times become rarer with N. Because the algorithm never goes uphill in the energy landscape, local energy minima cannot be an obstruction to finding solutions, at least in the region of the energy landscape visited by this algorithm. On the other hand, when ChainSAT fails to find a solution in linear time, this can also result from simply getting lost; in particular, the fraction of moves that lower the energy, relative to those that keep it constant, may dwindle to zero.
Fig. 6 illustrates the sensitivity of ChainSAT to the choice of the parameter p as N is increased on random K-satisfiability problems with K = 4. We observe that as p decreases, the solution times concentrate more rapidly about the mean with increasing N. Furthermore, the mean solution time scales roughly as 1/p when p is decreased.
Whiteness.
To provide a further empirical analysis of ChainSAT, we next present Fig. 7. This is discussed not in terms of solution times and the range of α achievable with a bit of tuning, but in terms of two quantities: (i) the average chain length l_{chain} during the course of finding a solution and (ii) the average whiteness depth (AWD). In more precise terms, the average chain length is l_{chain} = (f/m) − 1, where f is the total number of iterations of the main loop of ChainSAT and m is the number of times the if statement controlled by the chaining flag in the main loop is executed.
The AWD is related to the result of the so-called whitening procedure (24), described in pseudocode as follows:

 1: initially all clauses and variables are unmarked (non-white)
 2: mark (whiten) every clause that is unsatisfied
 3: mark (whiten) every clause that has more than one true literal
 4: D = 0
 5: repeat
 6:   mark (whiten) any unmarked variables that appear as satisfying literals only in marked clauses
 7:   if all the variables are marked then
 8:     declare that S is completely white
 9:     halt
10:   end if
11:   if no new variables were marked in this iteration then
12:     declare that S has a core
13:     halt
14:   end if
15:   mark (whiten) any unmarked clauses that contain at least one marked variable
16:   D = D + 1
17: end repeat
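An illustrative Python transcription of the procedure, returning each variable's whiteness depth (the packaging into a function and the signed-literal clause encoding are our own assumptions):

```python
def whitening(clauses, S):
    """Run the whitening procedure on a solution S.

    Returns (depth, completely_white): depth maps each whitened variable
    to the value of D at which it was marked; variables missing from
    depth have infinite whiteness depth (S has a core)."""
    def true_lit(lit):
        return (S[abs(lit)] == 1) == (lit > 0)
    variables = sorted({abs(lit) for c in clauses for lit in c})
    # a clause starts marked (white) if unsatisfied or with >1 true literal
    marked = [sum(1 for lit in c if true_lit(lit)) != 1 for c in clauses]
    depth, D = {}, 0
    while True:
        def whitens(v):  # v's satisfying occurrences are all in marked clauses
            return all(m for m, c in zip(marked, clauses)
                       if any(abs(lit) == v and true_lit(lit) for lit in c))
        newly = [v for v in variables if v not in depth and whitens(v)]
        for v in newly:
            depth[v] = D
        if len(depth) == len(variables):
            return depth, True       # S is completely white
        if not newly:
            return depth, False      # S has a core
        marked = [m or any(abs(lit) in depth for lit in c)
                  for m, c in zip(marked, clauses)]
        D += 1
```

For a completely white solution, the AWD is then sum(depth.values()) / len(depth).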
The whitening procedure is applied to the solution found when ChainSAT terminates. The whiteness depth of a variable is defined as the value of D in the whitening procedure at the time the variable gets marked (whitened); the value is infinite if the variable never gets marked during the procedure. The AWD of a solution is the average of the whiteness depths of its variables. See ref. 19 for an empirical discussion of whitening in the context of random K-SAT for K = 3. The key observation here is that the solutions found by ChainSAT all have a finite AWD. Loosely speaking, this means that there is "slack" in the solution.
Based on Fig. 7, it is clear that increasing the value of α has the same effect for K = 4, 5, 6: the average chain length l_{chain} increases and so does the AWD. Note that the ratio AWD / l_{chain} increases with α.
Discussion
We have here shown empirically that local search heuristics can be designed to avoid traps and "freezing" in random K-satisfiability, with solution times scaling linearly in N. This requires that circumspection be exercised; too greedy a descent causes the studied algorithms to fail, for reasons that remain unclear. A physics-inspired interpretation is that during a run the algorithm has to "equilibrate" on a constant-energy surface.
In terms of the parameter α, the pertinent question is how far the "easy" region, within which such solutions can be found, extends. For small K, the easy region may possibly extend all the way to the satisfiability/unsatisfiability transition point. The empirical evidence we have presented here points toward a divergence of the prefactor of the linear scaling in problem size well below α_{sat}. Furthermore, this divergence is stronger for higher values of K. For large values of K, the absence of traps may in any case be considered unlikely, as the rigorous techniques used to show clustering of solutions for K ≥ 8 (21) can also be used to show that there exist pairs of distant solutions separated by an extensive energy barrier. This suggests also the existence of local minima separated by extensive barriers. On the other hand, our present results for small K give no evidence in this direction. In particular, for K = 4, we have shown empirically that the energy landscapes can be navigated with simple randomized heuristics beyond all transition points predicted so far, apart from the satisfiability/unsatisfiability transition itself.
Our experiments also strongly suggest that the space of solutions for K = 4 at least up to α = 9.6 does not break into multiple clusters separated by extensive distance. All of the solutions found have “slack,” in the sense that they have a finite AWD. Is there an efficient way to find solutions that are not “white” in this sense; put otherwise, is the existence of “white” solutions necessary for “easy” solvability?
The observed success of ChainSAT adds evidence to the long-held belief in computer science that high-dimensional search spaces rarely have true local minima. This contrasts with the common practice of attributing the effectiveness of methods such as simulated annealing to their ability to make "uphill" moves. Our experiments suggest that "horizontal" moves (ΔE = 0) deserve equal credit.
All these observations raise further questions about the structure of the energy landscape, the solution space, and the workings of algorithms for random CSPs. They also pose challenges and constraints for theoretical attempts to understand these, including approaches from the physics of spin glasses.
Acknowledgments
This work was supported by the Integrated Project EVERGROW of the European Union (J.A. and S.K.), the Swedish Science Council through Linnaeus Centre ACCESS (E.A.), and the Academy of Finland through Grant 117499 (to P.K.) and the Center of Excellence Program (M.A. and S.S.). We thank the Department of Computer Science of the University of Helsinki and the Swedish Institute of Computer Science for the use of a little over 13 years of central processing unit time, and the Kavli Institute for Theoretical Physics China (KITPC) for hospitality.
Footnotes
 ^{¶}To whom correspondence should be addressed. E-mail: eaurell@kth.se

Author contributions: M.A. and E.A. designed research; M.A., J.A., E.A., P.K., and S.S. performed research; S.S. contributed new reagents/analytic tools; M.A., E.A., P.K., S.K., and S.S. analyzed data; and M.A., E.A., P.K., and P.O. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.
 © 2008 by The National Academy of Sciences of the USA
References
1. Garey MR, Johnson DS
2. Du D, Gu J, Pardalos P
3.
4. Dubois O, Monasson R, Selman B, Zecchina R
5. Mitchell D, Selman B, Levesque H
6.
7. Mézard M, Parisi G, Zecchina R
8. Aarts E, Lenstra JK
9. Hoos HH, Stützle T
10. Selman B, Kautz H, McAllester D
11.
12. Kirkpatrick S, Gelatt CD Jr, Vecchi MP
13. Papadimitriou CH
14. Barthel W, Hartmann AK, Weigt M
15. Semerjian G, Monasson R
16. Selman B, Kautz H, Cohen B (in Johnson DS, Trick MA, eds)
17. Ardelius J, Aurell E
18. Aurell E, Gordon U, Kirkpatrick S
19. Seitz S, Alava M, Orponen P
20.
21. Mézard M, Mora T, Zecchina R
22. Krzakala F, Montanari A, Ricci-Tersenghi F, Semerjian G, Zdeborová L
23. Battaglia D, Braunstein A, Chavas J, Zecchina R
24. Parisi G
25.
26. Kroc L, Sabharwal A, Selman B