# «A Tractable hypergraph properties for constraint satisfaction and conjunctive queries1 D´niel Marx, Computer and Automation Research Institute, ...»

We want to avoid dealing with assignments b ∈ sol(B) that cannot be extended to any member of sol(A) for some A ⊇ B. Of course, there is no easy way to avoid this in general (or even to detect if there is such a b): for example, if A is the set of all variables, then we would need to check if b can be extended to a solution. Therefore, we require only that

**there is no such unextendable b if A and B are M -small:**

Definition 4.4. A CSP instance I is M -consistent if solI (B) = prB solI (A) for all M -small sets B ⊆ A.

The notion of M -consistency is very similar to k-consistency, a standard notion in the constraint satisfaction literature [Atserias et al. 2007; Dalmau et al. 2002; Kolaitis and Vardi 2000b]. However, we restrict the considered subsets not by the number of variables, but by the number of solutions (more precisely, by considering only M -small sets). Thus the notion of M -consistency could be interpreted in the framework of Greco and Scarcello [2010], where consistency is deﬁned with respect to an arbitrary set of views.

Similarly to usual k-consistency, we can achieve M -consistency by throwing away partial solutions that violate the requirements: if we use the algorithm of Lemma 4.3 to ﬁnd all possible assignments of the M -small sets, then we can check if there is such an unextendable b for some M -small sets A and B. If there is such a b, then we can exclude it from consideration (without losing any solution of the instance) by introducing a new constraint on B.

By repeatedly excluding the unextendable assignments, we can avoid all such problems. We say that I = (V, D, C ) is a reﬁnement of I = (V, D, C) if for every constraint s, R ∈ C, there is a constraint s, R ∈ C such that R ⊆ R.

** Lemma 4.5.**

Let I = (V, D, C) be a CSP instance and M ≥ 1 an integer. There is an algorithm with running time 2O(|V |) · poly( I, M ) that produces an M -consistent CSP instance I that is a reﬁnement of I with sol(I) = sol(I ).

Proof. Using the algorithm of Lemma 4.3, we can ﬁnd all the M -small sets and then we can easily check if there are two M -small sets S ⊆ S violating consistency, i.e., sol(S) ⊆ prS sol(S ). In this case, let s be an |S|-tuple whose coordinates contain S in an arbitrary order and let us add the constraint s, prS sol(S ) ; it is clear that sol(V ) does not change but | sol(S)| strictly decreases. We repeat this step until the instance becomes M -consistent.

Note that adding the new constraint can make a set M -small that was not M -small before, thus we need to rerun the algorithm of Lemma 4.3. To bound the number of iterations before M -consistency is reached, observe that adding a new constraint does not increase | sol(A)| for any A and strictly decreases | sol(S)| for some M -small set S. As there are at most 2|V | sets S and | sol(S)| ≤ M for every M -small set S, it follows that this step can be repeated at most 2|V | · M times. The size of the instance increases in each step by adding a new constraint with at most M tuples, thus the size of the instance at the end of the process can be still bounded by 2O(|V |) · poly( I, M ). Thus the total time required to ensure that instance I is M -consistent can be bounded by 2O(|V |) · poly( I, M ).

We want to avoid degenerate cases where there is no solution even for some M -small sets.

Consistency implies that it is suﬃcient to require this for sets of size 1. We say that a CSP

**instance is nontrivial if sol({v}) = ∅ for every v ∈ V. The following is immediate:**

Proposition 4.6. If I is an M -consistent nontrivial CSP instance, then sol(S) = ∅ for every M -small set S.

It is well known that by achieving k-consistency, we can solve CSP instances with treewidth k: the key observation is that if an instance I with treewidth at most k has a k-consistent nontrivial reﬁnement I, then I has a solution. The following lemma adapts this statement to our setting.

** Lemma 4.7.**

Let I = (V, D, C) be a CSP instance and M ≥ 1 an integer. Let I be an M -consistent nontrivial reﬁnement of I. If the hypergraph H of I has a tree decomposition where every bag B is M -small in I, then I has a solution.

Proof. Suppose that there is such a tree decomposition (T, (Bt )t∈V (T ) ). Assume that T is rooted and for every node t ∈ V (T ), let Vt be the union of the bags that are descendants of t (including Bt ). We claim that every assignment in solI (Bt ) can be extended to an assignment of Vt that satisﬁes every constraint of I whose scope is fully contained in Vt.

Applying this statement to the root of T proves that there exists a solution for I. (Recall that every edge of the hypergraph H, and hence the scope of every constraint, is fully contained in one of the bags.) We prove the claim for every node of T in a bottom up order. The statement is trivial for the leaves. Let t1,..., t be the children of t and suppose the claim is true for these nodes. Consider an assignment g ∈ solI (Bt ). Since I is M -consistent and Bti is M -small, assignment g|Bt ∩Bti can be extended to an assignment gi ∈ solI (Bti ). As the claim is true for node ti, assignment gi can be extended to an assignment gi of Vti. The assignments g,

**g1,..., g can be combined to obtain an assignment g on Vt (note that this is well deﬁned:**

the intersection of Vti and Vtj is in Vt, which means that a variable appearing in both Vti and Vtj has the same value in g, gi, and gj ). Furthermore, every edge e of H that is fully contained in Vt is fully contained in at least one of Bt, Vt1,..., Vt, and the corresponding assignment among g, g1,..., g shows that g satisﬁes the constraint corresponding to e.

Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

Tractable hypergraph properties for constraint satisfaction and conjunctive queries A:17 Note the subtle detail that Lemma 4.7 does not claim that I has a solution. Furthermore, when Lemma 4.5 creates an M -consistent instance, then it possibly adds many new constraints and the hypergraph of I can be very dense even if the hypergraph of I has nice structure. However, this is not a problem, as Lemma 4.7 does not require any property on the hypergraph of I. Note also that every M -small set in I is M -small in I as well, thus I has a potentially larger collection of M -small sets, which makes ﬁnding the required tree decomposition of the hypergraph H of I easier.

4.2. Decomposition into uniform CSP instances Our algorithm for decomposing a CSP instance into uniform CSP instances is inspired by a combinatorial result of Alon et al. [2007], which shows that, for every ﬁxed n, an ndimensional point set S can be partitioned into polylog(|S|) classes such that each class is O(1)-uniform. We follow the same proof idea: the instance is split into two instances if uniformity is violated somewhere, and we analyze the change of an appropriately deﬁned weight function to bound the number of splits performed. However, the parameter setting is diﬀerent in our proof: we want to partition into f (|V |) classes, but we are satisﬁed with somewhat weaker uniformity. Another minor technical diﬀerence is that we require uniformity only on the N c -small sets.

**The following deﬁnitions gives the precise notion of uniformity that we use:**

Definition 4.8. Let I = (V, D, C) be a CSP instance. For B ⊆ A ⊆ V and an assignment b : B → D, let solI (A|B = b) := {a ∈ solI (A) | prB a = prB b}, the set of all extensions of b to a solution of prA I. Let maxI (A|B) = maxb∈solI (B) | solI (A|B = b)| (if solI (B) = ∅, then maxI (A|B) = 0). We deﬁne maxI (A|∅) = | solI (A)| and maxI (∅|∅) = 1.

We will drop I from the subscript of max if it is clear from the context.

**Let us prove two straightforward properties of the function max(A|B):**

Proposition 4.9. For every B ⊆ A ⊆ V with sol(A) = ∅ and C ⊆ V, we have (1 ) max(A|B) ≥ | sol(A)|/| sol(B)|, (2 ) max(A|B) ≥ max(A ∪ C|B ∪ C).

Proof. If every b ∈ sol(B) has at most max(A|B) extensions to A, then clearly | sol(A)| is at most | sol(B)| · max(A|B), proving the ﬁrst statement. To show the second statement, consider an x ∈ sol(B ∪ C) with max(A ∪ C|B ∪ C) extensions to A ∪ C. For any two y1, y2 ∈ sol(A ∪ C|B ∪ C = x) with y1 = y2, we have prC y1 = prC y2 = prC x, hence y1 and y2 can be diﬀerent only if prA y1 = prA y2. This means that prA y1 and prA y2 are two diﬀerent extensions of prB x to A. Therefore, max(A|B) ≥ | sol(A|B = prB x)| ≥ | sol(A ∪ C|B ∪ C = x)| = max(A ∪ C|B ∪ C), what we had to show.

Notice that (2) in Prop. 4.9 gives a hint that submodularity will be relevant: it is analogous to inequality (1) (Section 3) expressing that marginal value is larger with respect to a smaller set.

Definition 4.10. We say that A ⊆ V is c-uniform (for some integer c) if sol(A) = ∅ and, for every B ⊆ A, maxI (A|B) ≤ c| solI (A)|/| solI (B)|.

A CSP instance is (N, c, )-uniform if every N c -small set is N -uniform.

That is, A is c-uniform if every solution of solI (B) has at most c times as many extensions as the average number of extensions. The following lemma states the main combinatorial tool of our algorithm: splitting an instance into a constant number of uniform instances. Note

Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

Tractable hypergraph properties for constraint satisfaction and conjunctive queries A:19

** Lemma 4.11 guarantees some uniformity for the created instances, but not perfect 1uniformity and only for the N c -small sets.**

Thus in Lemma 4.12, we need to deﬁne b in a slightly diﬀerent and more technical way: we add some small terms to correct errors arising from the weaker uniformity and we truncate the function at large values (i.e., for sets that are not N c -small). Also, we state Lemma 4.12 for an arbitrary hypergraph H, possibly diﬀerent from the hypergraph of I; then to guarantee that b is edge dominated, we need that | sol(e)| ≤ N for every edge e of H. The reason we need the statement in this form is that in an instance Ii produced by Lemma 4.11 from instance I, the maximum size of a constraint relation in I is an upper bound on | solIi (e)| only if e corresponds to a constraint of I, but we have no such bound if e corresponds to a constraint of Ii not appearing in I.

** Lemma 4.12.**

Let I = (V, D, C) be a CSP instance and let H be a hypergraph on V (possibly diﬀerent from the hypergraph of I) such that | sol(e)| ≤ N holds for every e ∈ E(H). If I is N c -consistent and (N, c, 3 )-uniform for some c ≥ 1 and := 1/|V |, then the following function b is an edge-dominated, monotone, submodular function on H with

**b(∅) = 0:**

Proof. Let h(S) := 2 2 |S| − 3 |S|2. The function h(S) is a quadratic function of |S|; it is 0 when |S| = 0 or |S| = 2/, hence its maximum is at |S| = 1/ = |V (H)| with maximum value. Therefore, h(S) is monotone in the range 0 ≤ |S| ≤ |V (H)|. Furthermore, h is a

**submodular function:**

This calculation shows that if |X \ Y |, |Y \ X| ≥ 1, then we actually have h(X) + h(Y ) ≥ h(X ∩ Y ) + h(X ∪ Y ) + 2 3. We will use this extra 2 3 term to dominate the error terms arising from assuming only (N, c, 3 )-uniformity instead of perfect 1-uniformity.

Let us ﬁrst verify the monotonicity of b. If Y is N c -small, then every X ⊆ Y is N c -small, which implies | sol(X)| ≤ | sol(Y )| as I is N c -consistent. Therefore, b(X) ≤ b(Y ) follows from the monotonicity of h. If Y is not N c small, then b(Y ) = (1 − )c + h(Y ) and b(X) ≤ b(Y ) is clear for every X ⊆ Y, no matter whether X is N c -small or not.

To see that b is edge-dominated, consider an edge e ∈ E(H). By assumption, logN | sol(e)| ≤ 1 for every e ∈ E(H) and hence (using N c -consistency and c ≥ 1) e is N c -small. Thus b(e) ≤ (1 − ) + h(S) ≤ 1, as required.

Finally, let us verify the submodularity of b for some X, Y ⊆ V. If X ⊆ Y or Y ⊆ X, then there is nothing to show. Thus we can assume that |X \ Y |, |Y \ X| ≥ 1. We consider three cases depending on which of X and Y are N c -small. Suppose ﬁrst that X and Y are both N c -small. In this case,

Having constructed the submodular function b as in Lemma 4.12, we can use the argument described at the beginning of the section: if H has submodular width at most (1 − )c, then there is a tree decomposition where every bag is N c -small, and we can use this tree decomposition to ﬁnd a solution. In fact, by Lemma 4.7, in this case N c -consistency implies that every nontrivial instance has a solution.

Proof (of Theorem 4.1). Let I be an instance of CSP(H) having hypergraph H ∈ H.

We decide the solvability of I the following way. Let N ≤ I be the size of the largest constraint relation in I, i.e., every constraint has at most N satisfying assignments. Trivially, we have | solI (e)| ≤ N for every e ∈ E(H). Set := 1/|V (H)| (we may assume that |V (H)| ≥ 2), and let c := c0 /(1 − ) ≤ 2c0. Let us use the algorithm of Lemma 4.11 to produce the nontrivial N c -consistent (N, c, 3 )-uniform instances I1,..., It. The running O(|V |) O(|V (H)|) ·c/ · poly( I, N c ), which is 2c0 ·2 time of this step is 22 · I O(c0 ).

If t = 0, then we can conclude that I has no solution. Otherwise, we argue that I has a solution. Consider any Ii ; as Ii is a reﬁnement of I, we have | solIi (e)| ≤ | solI (e)| ≤ N for any e ∈ E(H). Let us use Lemma 4.12 with Ii and H (the hypergraph of I, not Ii !) to deﬁne the edge-dominated monotone submodular function b. By deﬁnition of submodular width, H has a tree decomposition (T, (Bt )t∈V (T ) ) such that b(Bt ) ≤ subw(H) ≤ c0 = (1 − )c for every t ∈ V (T ). Since b(S) ≤ (1 − )c implies | solIi (S)| ≤ N c and b is monotone, this means Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

A:22 D. Marx that Bt is N c -small in Ii for every t ∈ V (T ). Therefore, the conditions of Lemma 4.7 hold, and I has a solution.