# «A Tractable hypergraph properties for constraint satisfaction and conjunctive queries1 D´niel Marx, Computer and Automation Research Institute, ...»

For the proof, our strategy is similar to the embedding result of [Marx 2010b]: we show that a highly connected set implies that a uniform concurrent ﬂow exists, the paths appearing in the uniform concurrent ﬂow can be used to embed (a blowup of) the line graph of a complete graph, and every graph has an appropriate embedding in the line graph of a complete graph. To make this strategy work, we need generalizations of concurrent ﬂows, multicuts, and multicommodity ﬂows in our hypergraph setting and we need to obtain results that connect these √ concepts to highly connected sets. Some of these results are similar in spirit to the O( n)-approximation algorithms appearing in the combinatorial optimization literature [Gupta 2003; Hajiaghayi and R¨cke 2006; Agarwal et al. 2007]. However, those approximation algorithms are mostly a based on clever rounding of fractional solutions, while in our setting rounding is not an option: as discussed in Section 5, the existence of a fractional (X, Y )-separator of small weight does not imply the existence of a small integer separator. Thus we have to work directly with the fractional solution and use the properties of the highly connected set.

It turns out that the right notion of uniform concurrent ﬂow for our purposes is a collection of ﬂows that connect cliques: that is, a collection Fi,j (1 ≤ i j ≤ k) of compatible ﬂows, each of value, such that Fi,j is a (Ki, Kj )-ﬂow, where K1,..., Kk are disjoint cliques.

Thus our ﬁrst goal is to ﬁnd a highly connected set that can be partitioned into k cliques in an appropriate way.

Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

A:32 D. Marx

6.1. Highly connected sets with cliques Let (X1, Y1 ),..., (Xk, Yk ) be pairs of vertex sets such that the minimum weight of a fractional (Xi, Yi )-separator is si. Analogously to multicut problems in combinatorial optimization, we investigate weight assignments that simultaneously separate all these pairs.

Clearly, the minimum weight of such an assignment is at least the minimum of the si ’s and at most the sum of the si ’s. The following lemma shows that in a highly connected set, such a simultaneous separator cannot be very eﬃcient: roughly speaking, its weight is at least the square root of the sum of the si ’s.

** Lemma 6.3.**

Let µ be a fractional independent set in hypergraph H and let W be a (µ, λ)-connected set for some 0 λ ≤ 1. Let (X1,..., Xk, Y1,..., Yk ) be a partition of W, k let wi := min{µ(Xi ), µ(Yi )} ≥ 1/2, and let w := i=1 wi. Let s : E(H) → R+ be a weight assignment of total weight p such that s is a fractional (Xi, Yi )-separator for every 1 ≤ i ≤ k.

√ Then p ≥ (λ/7) · w.

Proof. Let us deﬁne the function s by s (e) = 6s(e) and let x(v) := e∈E(H),v∈e s (e).

We deﬁne the distance d(u, v) to be the minimum of r∈P x(r), taken over all paths P from u to v. It is clear that the triangle inequality holds, i.e., d(u, v) ≤ d(u, z) + d(z, v) for every u, v, z ∈ V (H). If s covers every u − v path, then d(u, v) ≥ 6: every edge e intersecting a u − v path P contributes at least s (e) to the sum r∈P x(r) (as e can intersect P in more than one vertices, e can increase the sum by more than s (e)). On the other hand, we claim that if d(u, v) ≥ 2, then s covers every u − v path. Clearly, it is suﬃcient to verify this for minimal paths. Such a path P can intersect an edge e at most twice, hence e contributes at most 2s (e) to the sum r∈P x(r) ≥ 2, implying that the edges intersecting P have total weight at least 1 in s. √ Suppose for contradiction that p (λ/7) · w, that is, w 49p2 /λ2. As s is an k (Xi, Yi )-separator, we have that p ≥ 1. Let A := ∅ and B := i=1 (Xi ∪ Yi ). Note that k µ(B) ≥ 2 i=1 wi = 2w. We will increase A and decrease B while maintaining the invariant condition that the distance of A and B is at least 2 in d. Let T be the smallest T integer such that i=1 wi 6p/λ; if there is no such T, then w ≤ 6p/λ, a contradiction.

As wi ≥ 1/2 for every i, it follows that T ≤ 12p/λ + 1 ≤ 13p/λ (since p ≥ 1 and λ ≤ 1).

For i = 1, 2,..., T, we perform the following step. Let Xi (resp., Yi ) be the set of all vertices of W that are at distance at most 2 from Xi (resp., Yi ). As the distance of Xi and Yi is at least 6, by the triangle inequality the distance of Xi and Yi is at least 2, hence s is a fractional (Xi, Yi )-separator. Since W is (µ, λ)-connected and s is an assignment of weight 6p, we have min{µ(Xi ), µ(Yi )} ≤ 6p/λ. If µ(Xi ) ≤ 6p/λ, then let us put Xi (note: not Xi ) into A and let us remove Xi from B. The set Xi, which we remove from B, contains all the vertices that are at distance at most 2 from any new vertex in A, hence it remains true that the distance of A and B is at least 2. Similarly, if µ(Xi ) 6p/λ and µ(Yi ) ≤ 6p/λ, then let us put Yi into A and let us remove Yi from B. Note that we may put a vertex into A even if it was removed from B in an earlier step.

In the i-th step of the procedure, we increase µ(A) by at least wi (as µ(Xi ), µ(Yi ) ≥ wi and these sets are disjoint from the sets already contained in A) and µ(B) is decreased by T at most 6p/λ. Thus at the end of the procedure, we have µ(A) ≥ i=1 wi 6p/λ and µ(B) ≥ 2w − T · 6p/λ 98p2 /(λ2 ) − (13p/λ)(6p/λ) 6p/λ, that is, min{µ(A), µ(B)} 6p/λ. By the invariant condition, the distance of A and B is at least 2, thus s is a fractional (A, B)-separator of weight exactly 6p, contradicting the assumption that W is (µ, λ)-connected.

In the rest of the section, we need a more constrained notion of ﬂow, where the endpoints “respect” a particular fractional independent set. Let µ1, µ2 be fractional independent sets Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

Tractable hypergraph properties for constraint satisfaction and conjunctive queries A:33 of hypergraph H and let X, Y ⊆ V (H) be two (not necessarily disjoint) sets of vertices.

A (µ1, µ2 )-demand (X, Y )-ﬂow is an (X, Y )-ﬂow F such that for each x ∈ X, the total weight of the paths in F having ﬁrst endpoint x is at most µ1 (x), and similarly, the total weight of the paths in F having second endpoint y ∈ Y is at most µ2 (y). Note that there is no bound on the weight of the paths going through an x ∈ X, we only bound the paths whose ﬁrst/second endpoint is x. The deﬁnition is particularly delicate if X and Y are not disjoint, in this case, a vertex z ∈ X ∩ Y can be the ﬁrst endpoint of some paths and the second endpoint of some other paths, or it can be even both the ﬁrst and second endpoint of a path of length 0. We use the abbreviation µ-demand for (µ, µ)-demand.

The following lemma shows that if a ﬂow connects a set U with a highly connected set W, then U is highly connected as well (“W can be moved to U ”). This observation will be used in the proof of Lemma 6.5, where we locate cliques and show that their union is highly connected, since there is a ﬂow that connects the cliques to a highly connected set.

** Lemma 6.4.**

Let H be a hypergraph, µ1, µ2 fractional independent sets, and W ⊆ V (H) a (µ1, λ)-connected set for some 0 λ ≤ 1. Suppose that U ⊆ V (H) is a set of vertices and F is a (µ1, µ2 )-demand (W, U )-ﬂow of value µ2 (U ). Then U is (µ2, λ/6)-connected.

Proof. Suppose that there are disjoint sets A, B ⊆ U and a fractional (A, B)-separator s of weight w (λ/6) · min{µ2 (A), µ2 (B)}. (Note that this means µ2 (A), µ2 (B) 6w/λ ≥ 6w.) For a path P, let s(P ) = e∈E(H),e∩P =∅ s(e) be the total weight of the edges intersecting P. Let A ⊆ W (resp., B ⊆ W ) contain a vertex v ∈ W if there is a path P in F with ﬁrst endpoint v and second endpoint in A (resp., B) and s(P ) ≤ 1/3. If A ∩ B = ∅, then it is clear that there is a path P with s(P ) ≤ 2/3 connecting a vertex of A and a vertex of B via a vertex of A ∩ B, a contradiction. Thus we can assume that A and B are disjoint.

Since F is a ﬂow and s has weight w, the total weight of the paths in F with s(P ) ≥ 1/3 is at most 3w. As the value of F is exactly µ2 (U ), the total weight of the paths in F with second endpoint in A is exactly µ2 (A). If s(P ) ≤ 1/3 for such a path, then its ﬁrst endpoint is in A by deﬁnition. Therefore, the total weight of the paths in F with ﬁrst endpoint in A is at least µ2 (A) − 3w, which means that µ1 (A ) ≥ µ2 (A) − 3w ≥ µ2 (A)/2. Similarly, we have µ1 (B ) ≥ µ2 (B)/2. Since W is (µ1, λ)-connected and s is an assignment with weight less than (λ/6) · min{µ2 (A), µ2 (B)} ≤ (λ/3) · min{µ1 (A ), µ1 (B )}, there is an A − B path P with s(P ) 1/3. Now the concatenation of an A − A path PA having s(PA ) ≤ 1/3, the path P, and a B − B path PB having s(PB ) ≤ 1/3 forms an A − B path that is not covered by s, a contradiction.

A µ-demand multicommodity ﬂow between pairs (A1, B1 ),..., (Ar, Br ) is a set F1,..., Fr of compatible ﬂows such that Fi is a µ-demand (Ai, Bi )-ﬂow (recall that a set of ﬂows is compatible if their sum is also a ﬂow, that is, does not violate the edge constraints). The r value of a multicommodity ﬂow is the sum of the values of the r ﬂows. Let A = i=1 Ai, r B = i=1 Bi, and let us restrict our attention to the case when (A1,..., Ar, B1,..., Br ) is a partition of A ∪ B. In this case, the maximum value of a µ-demand multicommodity ﬂow between pairs (A1, B1 ),..., (Ar, Br ) can be expressed as the optimum values of the primal and dual linear programs in Figure 5.

The following lemma shows that if conλ (H) is suﬃciently large, then there is a highly connected set that has the additional property that it is the union of k cliques K1,..., Kk with µ(Ki ) ≥ 1/2 for every clique. The high-level idea of the proof is the following.

Take a (µ, λ)-connected set W with µ(W ) = conλ (H) and ﬁnd a large multicommodity ﬂow between some pairs (A1, B1 ),..., (Ar, Br ) in W. Consider the dual solution y. By complementary slackness, every edge with nonzero value in y covers exactly 1 unit of the multicommodity ﬂow. If most of the weight of the dual solution is on the edge variables, then we can choose k edges that cover at least Ω(k) units of ﬂow. These edges are connected to

Fig. 5. Primal and dual linear programs for µ-demand multicommodity ﬂow between pairs (A1, B1 ),..., (Ar, Br ). We denote by Puv the set of all u − v paths.

W by a ﬂow, and therefore by Lemma 6.4 the union of these edges is also highly connected and obviously can be partitioned into a small number cliques.

There are two things that can go wrong with this argument. First, it can happen that the dual solution assigns most of the weight to the vertex variables y(u), y(v) (u ∈ A, v ∈ B).

The cost of covering the Ai − Bi paths using vertex variables only is min{µ(Ai ), µ(Bi )}, thus this case is only possible if the value of the dual (and hence the primal) solution is close r to i=1 (min{µ(Ai ), µ(Bi )}). To avoid this situation, we want to select the pairs (Ai, Bi ) such that they are only “moderately connected”: there is a fractional (Ai, Bi )-separator of weight 2λ min{µ(Ai ), µ(Bi )}, that is, at most twice the minimum possible. This means that r the weight of the dual solution is at most 2λ i=1 (min{µ(Ai ), µ(Bi )}), which is much less r than i=1 (min{µ(Ai ), µ(Bi )} (if λ is small). If we are not able to ﬁnd suﬃciently many such pairs, then we argue that a larger highly connected set can be obtained by scaling µ by a factor of 2. More precisely, we show that there is a large subset W ⊆ W that is (2µ, λ)-connected and 2µ(W ) conλ (H), a contradiction (a technical diﬃculty here that we have to make sure ﬁrst that 2µ is also a fractional independent set).

The second problem we have to deal with is that the value of the dual solution can be so small that we ﬁnd a very small set of edges that already cover a large fraction of the multicommodity ﬂow. However, we can use Lemma 6.3 to argue that a weight assignment on the edges that covers a large multicommodity ﬂow in a (µ, λ)-connected set cannot have very small weight.

** Lemma 6.5.**

Let H be a hypergraph and let 0 λ 1/16 be a constant. Then there is fractional independent set µ, a (µ, λ/6)-connected set W, and a partition (K1,..., Kk ) of W such that k = Ω(λ conλ (H)), and for every 1 ≤ i ≤ k, Ki is a clique with µ(Ki ) ≥ 1/2.

**Proof. Let k be the largest integer such that conλ (H) ≥ 3T + 2k holds, where T :=**

(56/λ)2 · k 2 ; it is clear that k = Ω(λ conλ (H)). Let µ0 be a fractional independent set and W0 be a (µ0, λ)-connected set with µ0 (W0 ) = conλ (H). We can assume that µ0 (v) 0 if and only if v ∈ W0. This also implies that W0 is in one connected component of H.

Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

Tractable hypergraph properties for constraint satisfaction and conjunctive queries A:35 Highly loaded edges. First, we want to modify µ0 such that there is no edge e with µ0 (e) ≥ 1/2. The following claim shows that we can achieve this by restricting µ0 to an appropriate subset W of W0.

** Claim 1. There is a subset W ⊆ W0 such that µ0 (W ) ≥ conλ (H) − k and µ0 (e ∩ W ) 1/2 for every edge e.**

Proof.

Let us choose edges g1, g2,... as long as possible with the requirement µ0 (Ki ) ≥ 1/2 for i−1 Ki := (gi ∩ W0 ) \ j=1 Kj. If we can select at least k such edges, then the cliques K1,..., k Kk satisfy the requirements of Lemma 6.5 and we are done. Indeed, W := i=1 Ki ⊆ W0 is a (µ0, λ)-connected set, µ0 (Ki ) ≥ 1/2, and (K1,..., Kk ) is a partition of W into cliques.

Thus we can assume that the selection of the edges stops at edge gt for some t k. Let t W := W0 \ i=1 Ki. Observe that there is no edge e ∈ E(H) with µ0 (e ∩ W ) ≥ 1/2, as in this case the selection of the edges could be continued with gt+1 := e. Furthermore, we t have µ0 (W ) = µ0 (W0 \ i=1 Ki ) µ0 (W0 ) − k = conλ (H) − k, as required.