FREE ELECTRONIC LIBRARY - Abstracts, online materials

Pages:   || 2 | 3 | 4 | 5 |   ...   | 11 |

«A Tractable hypergraph properties for constraint satisfaction and conjunctive queries1 D´niel Marx, Computer and Automation Research Institute, ...»

-- [ Page 1 ] --


Tractable hypergraph properties for constraint satisfaction and

conjunctive queries1

D´niel Marx, Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA


SZTAKI), Budapest, Hungary, dmarx@cs.bme.hu. 2

An important question in the study of constraint satisfaction problems (CSP) is understanding how the

graph or hypergraph describing the incidence structure of the constraints influences the complexity of the

problem. For binary CSP instances (i.e., where each constraint involves only two variables), the situation is well understood: the complexity of the problem essentially depends on the treewidth of the graph of the constraints [Grohe 2007; Marx 2010b]. However, this is not the correct answer if constraints with unbounded number of variables are allowed, and in particular, for CSP instances arising from query evaluation problems in database theory. Formally, if H is a class of hypergraphs, then let CSP(H) be CSP restricted to instances whose hypergraph is in H. Our goal is to characterize those classes of hypergraphs for which CSP(H) is polynomial-time solvable or fixed-parameter tractable, parameterized by the number of variables. Note that in the applications related to database query evaluation, we usually assume that the number of variables is much smaller than the size of the instance, thus parameterization by the number of variables is a meaningful question.

The most general known property of H that makes CSP(H) polynomial-time solvable is bounded fractional hypertree width. Here we introduce a new hypergraph measure called submodular width, and show that bounded submodular width of H (which is a strictly more general property than bounded fractional hypertree width) implies that CSP(H) is fixed-parameter tractable. In a matching hardness result, we show that if H has unbounded submodular width, then CSP(H) is not fixed-parameter tractable (and hence not polynomial-time solvable), unless the Exponential Time Hypothesis (ETH) fails. The algorithmic result uses tree decompositions in a novel way: instead of using a single decomposition depending on the hypergraph, the instance is split into a set of instances (all on the same set of variables as the original instance), and then the new instances are solved by choosing a different tree decomposition for each of them. The reason why this strategy works is that the splitting can be done in such a way that the new instances are “uniform” with respect to the number extensions of partial solutions, and therefore the number of partial solutions can be described by a submodular function. For the hardness result, we prove via a series of combinatorial results that if a hypergraph H has large submodular width, then a 3SAT instance can be efficiently simulated by a CSP instance whose hypergraph is H. To provethese combinatorial results, we need to develop a theory of (multicommodity) flows on hypergraphs and vertex separators in the case when the function b(S) defining the cost of separator S is submodular, which can be of independent interest.

Categories and Subject Descriptors: F.2 [Theory of Computing]: Analysis of Algorithms and Problem Complexity; G.2.2 [Mathematics of Computing]: Discrete Mathematics—Graph Theory General Terms: Algorithms Additional Key Words and Phrases: constraint satisfaction, hypergraphs, hypertree width, fractional edge covers

1. INTRODUCTION There is a long line of research devoted to identifying hypergraph properties that make the evaluation of conjunctive queries tractable (see e.g. [Gottlob et al. 2002a; Scarcello et al. 2008; Grohe 2006; 2007]). Our main contribution is giving a complete theoretical answer to this question: in a very precise technical sense, we characterize those hypergraph properties that imply tractability for the evaluation of a query. Efficient evaluation of queries is originally a question of database theory; however, it has been noted that the problem can be treated as a constraint satisfaction problem (CSP) and this connection led to a fruitful

–  –  –

interaction between the two communities [Kolaitis and Vardi 2000a; Gottlob and Szeider 2008; Scarcello et al. 2008]. Most of the literature relevant to the current paper use the language of constraint satisfaction. Therefore, after a brief explanation of the databasetheoretic motivation, we switch to the language of CSPs.

Conjunctive queries. Evaluation of conjunctive queries (or equivalently, Select-ProjectJoin queries) is one of the most basic and most studied tasks in relational databases.

A relational database consists of a fixed set of relations. A conjunctive query defines a new relation that can be obtained as first taking the join of some relations and then projecting it to a subset of the variables. As an example, consider a relational database that contains three relations: enrolled(Person, Course, Date), teaches(Person, Course, Year), parent(Person1, Person2). The following query Q defines a unary relation ans(P ) with the meaning that “P is enrolled in a course taught by her parent.” Q : ans(P ) ← enrolled(P, C, D) ∧ teaches(P 2, C, Y ) ∧ parent(P 2, P ).

In the Boolean Conjunctive Query problem, the task is only to decide if the answer relation is empty or not, that is, if the join of the relations is empty or not. This is usually denoted as the relation “ans” not having any variables. Boolean Conjunctive Query contains most of the combinatorial difficulty of the general problem without complications such that the size of the output being exponentially large. Therefore, the current paper focuses on this decision problem.

In a natural way, we can define the hypergraph of a query: its vertices are the variables appearing in the query and for each relation there is a corresponding hyperedge containing the variables appearing in the relation. Intuitively, if the hypergraph has “simple structure,”

then the query is easy to solve. For example, compare the following two queries:

Q1 : ans ← R1 (A, B, C) ∧ R2 (C, D) ∧ R3 (D, E, F ) ∧ R4 (E, F, G, H) ∧ R5 (H, I) Q2 : ans ← R1 (A, B) ∧ R2 (A, C) ∧ R3 (A, D) ∧ R4 (B, C) ∧ R5 (B, D) ∧ R6 (C, D) Even though more variables appear in Q1, evaluating it seems to be easier: its hypergraph is “path like,” thus the query can be answered efficiently by, say, dynamic programming techniques. On the other hand, the hypergraph of Q2 is a clique on 4 vertices and no significant shortcut is apparent compared to trying all possible combinations of values for (A, B, C, D).

What are those hypergraph properties that make Boolean Conjunctive Query tractable?

In the early 80s, it has been noted that acyclicity is one such property [Beeri et al. 1983;

Fagin 1983; Yannakakis 1981; Beeri et al. 1981]. Later, more general such properties were identified in the literature: for example, bounded query width [Chekuri and Rajaraman 2000], bounded hypertree width [Gottlob et al. 2002a], and bounded fractional hypertree width [Marx 2010a; Grohe and Marx 2012]. Our goal is to find the most general hypergraph property that guarantees an efficient solution for query evaluation.

Constraint satisfaction. Constraint satisfaction is a general framework that includes many standard algorithmic problems such as satisfiability, graph coloring, database queries, etc. [Grohe 2006; Feder and Vardi 1999]. A constraint satisfaction problem (CSP) consists of a set V of variables, a domain D, and a set C of constraints, where each constraint is a relation on a subset of the variables. The task is to assign a value from D to each variable in such a way that every constraint is satisfied (see Definition 2.1 in Section 2 for the formal definition). For example, 3SAT can be interpreted as a CSP problem where the domain is D = {0, 1} and the constraints in C correspond to the clauses (thus the arity of each constraint is 3). As another example, let us observe that the k-Clique problem (Is there a kclique in a given graph G?) can be easily expressed as a CSP instance the following way. Let D be the set of vertices of G, let V contain k variables, and let C contain k constraints, one constraint on each pair of variables. The binary relation of these constraints require Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

Tractable hypergraph properties for constraint satisfaction and conjunctive queries A:3

that the two vertices are distinct and adjacent. Therefore, the CSP instance has a solution if and only if G has a k-clique.

It is easy to see that Boolean Conjunctive Query can be formulated as the problem of deciding if a CSP instance has a solution: the variables of the CSP instance correspond to the variables appearing in the query and the constraints correspond to the database relations.

A distinctive feature of CSP instances obtained this way is that the number of variables is small (as queries are typically small), while the domain of the variables are large (as the database relations usually contain a large number of entries). This has to be contrasted with typical CSP problems from AI, such as 3-colorability and satisfiability, where the domain is small, but the number of variables is large. As our motivation is database-theoretic, in the rest of the paper the reader should keep in mind that we are envisioning scenarios where the number of variables is small and the domain is large.

As the examples above show, solving constraint satisfaction problems is NP-hard in general if there are no additional restrictions on the input instances. The main goal of the research on CSP is to identify tractable special cases of the general problem. The theoretical literature on CSP investigates two main types of restrictions. The first type is to restrict the constraint language, that is, the type of constraints that are allowed. This direction includes the classical work of Schaefer [1978] and its many generalizations [Bulatov 2006;

2003; Bulatov et al. 2001; Feder and Vardi 1999; Jeavons et al. 1997]. The second type is to restrict the structure induced by the constraints on the variables. The hypergraph of a CSP instance is defined to be a hypergraph on the variables of the instance such that for each constraint c ∈ C there is a hyperedge ec containing exactly the variables that appear in c. If the hypergraph of the CSP instance has very simple structure, then the instance is easy to solve. For example, it is well-known that a CSP instance I with hypergraph H can be solved in time I O(tw(H)) [Freuder 1990], where tw(H) denotes the treewidth of H and I is the size of the representation of I in the input.

Our goal is to characterize the “easy” and “hard” hypergraphs from the viewpoint of constraint satisfaction. However, formally speaking, CSP is polynomial-time solvable for every fixed hypergraph H: since H has a constant number k of vertices, every CSP instance with hypergraph H can be solved by trying all I k possible combinations on the k variables. It makes more sense to characterize those classes of hypergraphs where CSP is easy. Formally, for a class H of hypergraphs, let CSP(H) be the restriction of CSP where the hypergraph of the instance is assumed to be in H. For example, as discussed above, we know that if H is a class of hypergraphs with bounded treewidth (i.e., there is a constant w such that tw(H) ≤ w for every H ∈ H), then CSP(H) is polynomial-time solvable.

For the characterization of the complexity of CSP(H), we can investigate two notions of tractability. CSP(H) is polynomial-time solvable if there is an algorithm solving every instance of CSP(H) in time ( I )O(1), where I is the length of the representation of I in the input. The following notion interprets tractability in a less restrictive way: CSP(H) is fixed-parameter tractable (FPT) if there is an algorithm solving every instance I of CSP(H) in time f (H)( I )O(1), where f is an arbitrary computable function of the hypergraph H of the instance. Equivalently, the factor f (H) in the definition can be replaced by a factor f (k) depending only on the number k of vertices of H: as the number of hypergraphs on k vertices (without parallel edges) is bounded by a function of k, the two definitions result in the same notion. The motivation behind this definition is that if the number of variables is assumed to be much smaller than the the domain size, then we can afford even exponential dependence on the number of variables, as long as the dependence on the size of the instance is polynomial. For a more background on fixed-parameter tractability, the reader is referred to the parameterized complexity literature [Downey and Fellows 1999; Flum and Grohe 2006; Niedermeier 2006].

Journal of the ACM, Vol. V, No. N, Article A, Publication date: January YYYY.

A:4 D. Marx The case of bounded arities. If the constraints have bounded arity (i.e., the edge size in H is bounded by a constant r), then the complexity of CSP(H) is well understood. In

this case, bounded treewidth is the only polynomial-time solvable case:

Theorem 1.1 (Grohe [2007]).

If H is a recursively enumerable class of hypergraphs

with bounded edge size, then (assuming FPT = W[1]) the following are equivalent:

(1 ) CSP(H) is polynomial-time solvable.

(2 ) CSP(H) is fixed-parameter tractable.

(3 ) H has bounded treewidth.

The assumption FPT = W[1] is a standard hypothesis of parameterized complexity. Thus in the bounded arity case bounded treewidth is the only property of the hypergraph that can make the problem polynomial-time solvable. By definition, polynomial-time solvability implies fixed-parameter tractability, but Theorem 1.1 proves the surprising result that whenever CSP(H) is fixed-parameter tractable, it is polynomial-time solvable as well.

The following sharpening of Theorem 1.1 shows that there is no algorithm whose running time is significantly better than the I O(tw(H)) bound of the treewidth based algorithm, and this is true if we restrict the problem to any class H of hypergraphs. The result is proved under the Exponential Time Hypothesis (ETH) [Impagliazzo et al. 2001] stating that there is no 2o(n) time algorithm for n-variable 3SAT, which is a somewhat stronger assumption than FPT = W[1].

Theorem 1.2 (Marx [2010b]).

Pages:   || 2 | 3 | 4 | 5 |   ...   | 11 |

Similar works:

«Northern Beaches Business Education Network CONTENTS Organisational Overview Management and Board Board Chairperson’s Report Executive Officer Report Individual Programme Reports Acknowledgement Of Donors, Sponsors, Supporters, Volunteers Financial Reports Including Audit Reports DIRECTORY Main Office Unit 4&5, 529 Pittwater Road Brookvale NSW 2100 PH 02 9907 3133 FAX 02 99071594 Ingleside Office Granmas Refuge Lot 4 Tumburra Street Ingleside, NSW 2101 PH 02 9986 3339 FAX 02 Executive Officer...»

«Teresa Gillotti, Daniel Kildee: Land Banks as Revitalization Tools Land Banks as Revitalization Tools: The example of Genesee County and the City of Flint, Michigan Teresa Gillotti, Genesee Institute1 Research Fellow & Daniel Kildee, Genesee County Treasurer Genesee Institute, Flint, MI E-mail: tgillotti@gmail.com E-mail: dkildee@sbcglobal.net Introduction Flint, Michigan is a quintessential shrinking city. Based around not just a single industry but a single corporation, Flint’s fortunes...»

«ANNEX 1 – BACKGROUND DOCUMENT TO RAC OPINION ON NITRIC ACID Committee for Risk Assessment RAC Annex 1 Background document to the Opinion proposing harmonised classification and labelling at Community level of nitric acid EC Number: 231-714-2 CAS Number: 7697-37-2 CLH-O-0000002560-82-03/A1 The Background Document (BD) is a compilation of information considered relevant by the dossier submitter or by RAC for the proposed classification. It includes the proposal of the dossier submitter and the...»

«2015/16 National Tariff Payment System: A consultation notice Annex 4a: Additional information on currencies with national prices 26 November 2014 Monitor publication code IRCP 14/14 NHS England Publications Gateway Reference 02457 NHS England Document Classification: Official 2015/16 National Tariff Payment System: A consultation notice Annex 4a: Additional information on currencies with national prices This annex is designed to help with the implementation of the ‘2015/16 National Tariff...»

«Agenda for a Meeting of CLASSIS HAMILTON OF THE CHRISTIAN REFORMED CHURCH Date: February 23, 2016 Time: 2:00 PM – 8:50 PM Venue: Meadowlands Fellowship Christian Reformed Church, 211 Stonehenge Dr., Ancaster L9K 1R4 Officers of Classis: Synodical Deputies: Chair: Rita Klein-Geltink James Dekker Vice Chair: Everett Vander Horst Ronald Fisher Stated Clerk: Dick Kranendonk Herman Praamsma Reporter: Ballot Committee: Immanuel CRC Simcoe Members of Meadowlands CRC, Ancaster Credentials Committee:...»

«BY ORDER OF THE COMMANDER 45TH SPACE WING INSTRUCTION 32-1001 45TH SPACE WING 5 JULY 2012 Civil Engineering FACILITY SPACE ALLOCATION COMPLIANCE WITH THIS PUBLICATION IS MANDATORY ACCESSIBILITY: Publications and forms are available on the e-Publishing website at www.e-publishing.af.mil for downloading or ordering. RELEASABILITY: There are no releasability restrictions on this publication. OPR: 45 CES/CEAO Certified by: 45 MSG/CC (Col Loretta A. Kelemen) Supersedes: 45 SWI32-1001, Pages: 17 1...»

«area handbook series Mauritania 00 a country study DTIC S LECTE I 0APR 1 990 OIIC HLL COPY Approvd for public rollelm, nwi mited.,Din ftu t u : X ek= ;.,;,,. i IIn ! I IA Mauritania a country study Federal Research Division Library of Congress Edited by Robert E. Handloff Research Completed December 1987 On the cover: Pastoralists near 'Ayofin el 'Atrofis Second Edition, First Printing, 1990. Copyright 01990 United States Government as represented by the Secretary of the Army. All rights...»

«Vulnerability Report XYZ Computer Company Prepared By ITMS, LLC Kenneth C. Romer Certified Ethical Hacker Is Your Information Safe From Prying Eyes? We Specialize In Network Security Scans. kcr@itms.us.com / www.itms.us.com Scan Information Host Information DNS Name: -static.hfc.comcastbusiness.net IP: xxxxxxxxx OS: Microsoft Windows Vista Results Summary Critical High Medium Low Info Total Results Details 0/tcp 12053 Host Fully Qualified Domain Name (FQDN) Resolution Synopsis It was possible...»

«Great Plains Studies, Center for Great Plains Research: A Journal of Natural and Social Sciences University of Nebraska Lincoln Year  Sustainability of the Great Plains in an Uncertain Climate William E. Riebsame University of Colorado, william.travis@colorado.edu This paper is posted at DigitalCommons@University of Nebraska Lincoln. http://digitalcommons.unl.edu/greatplainsresearch/10 SUSTAINABILITY OF THE GREAT PLAINS IN AN UNCERTAIN CLIMATE William E. Riebsame Department of...»

«Vietnam 1965, 50 Years On. Colin Geraghty, Service Number O314232, RAAF Caribou Pilot.1. Introduction: 50 Years on, a few recollections of my time as an RAAF Caribou Pilot in the South Vietnam War, during 1965. Some of these situations caused me concern, but were not overwhelming and like everyone else I carried on with my job no matter what happened. Early in the Australian participation in the war our unit, RAAF Transport Flight Vietnam (RTFV), along with a small number of experienced Army...»

«Wildwood School Parent Handbook Policies and Procedures Dear Parents, We would like to welcome you to the “Wildwood Family”! Our “hobbit house” school is nestled among beaver ponds, willows and evergreens on several acres of White River National Forest Land. Our natural setting opens the door for our unique environmental arts program. We stress a very hands-on approach to learning, including varied nature hikes in the summer; gardening in our Grow Dome, and animal tracking on snowshoes...»

«Calhoun: The NPS Institutional Archive Theses and Dissertations Thesis and Dissertation Collection 2005-12 The Revolutionary Armed Forces of Colombia People's Army (FARC-EP) Marxist-Leninist insurgency or criminal enterprise? Saskiewicz, Paul E. Monterey, California. Naval Postgraduate School http://hdl.handle.net/10945/1809 NAVAL POSTGRADUATE SCHOOL MONTEREY, CALIFORNIA THESIS THE REVOLUTIONARY ARMED FORCES OF COLOMBIA – PEOPLE’S ARMY (FARC-EP): MARXIST-LENINIST INSURGENCY OR CRIMINAL...»

<<  HOME   |    CONTACTS
2017 www.abstract.dislib.info - Abstracts, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.