Authors: Spinrath, Christopher
Title: Query evaluation revised: parallel, distributed, via rewritings
Language (ISO): en
Abstract: This is a thesis on query evaluation in parallel and distributed settings, and structurally simple rewritings. It consists of three parts. In the first part, we investigate the efficiency of constant-time parallel evaluation algorithms. That is, the number of required processors or, asymptotically equivalent, the work required to evaluate queries in constant time. It is known that relational algebra queries can be evaluated in constant time. However, work-efficiency has not been a focus, and indeed known evaluation algorithms yield huge (polynomial) work bounds. We establish work-efficient constant-time algorithms for several query classes: (free-connex) acyclic, semi-join algebra, and natural join queries; the latter in the worst-case framework. The second part is about deciding parallel-correctness of distributed evaluation strategies: Given a query and policies specifying how data is distributed and communicated among multiple servers, does the distributed evaluation yield the same result as the classical evaluation, for every database? Ketsman et al. proved that parallel-correctness for Datalog is undecidable; by reduction from the undecidable containment problem for Datalog. We show that parallel-correctness is already undecidable for monadic and frontier-guarded Datalog queries, for which containment is decidable. However, deciding parallel-correctness for frontier-guarded Datalog and constraint-based communication policies satisfying a certain property is 2ExpTime-complete. Furthermore, we obtain the same bounds for the parallel-boundedness problem, which asks whether the number of required communication rounds is bounded, over all databases. The third part is about structurally simple rewritings. The (classical) rewriting problem asks whether, for a given query and a set of views, there is a query, called rewriting, over the views that is equivalent to the given query. We study the variant of this problem for (subclasses of) conjunctive queries and views that asks for a structurally simple rewriting. We prove that, if the given query is acyclic, an acyclic rewriting exists if there is any rewriting at all. Analogous statements hold for free-connex acyclic, hierarchical, and q-hierarchical queries. Furthermore, we prove that the problem is NP-hard, even if the given query and the views are acyclic or hierarchical. It becomes tractable if the views are free-connex acyclic or q-hierarchical (and the arity of the database schema is bounded).
Subject Headings: Query evaluation
Datalog
Acyclic conjunctive queries
Free-connex queries
Hierarchical queries
PRAM
Massively parallel communication
Distributed databases
Parallel evaluation
Parallel-correctness
Work-efficient
Rewriting
Acyclic rewriting
Complexity
NP-hardness
Subject Headings (RSWK): Verteiltes Datenbanksystem
Komplexität
Parallelisierung
Abfrageverarbeitung
Leistungsbewertung
URI: http://hdl.handle.net/2003/42383
http://dx.doi.org/10.17877/DE290R-24219
Issue Date: 2024
Rights link: https://creativecommons.org/licenses/by-nc-sa/4.0/
Appears in Collections:LS 01 Logik in der Informatik

Files in This Item:
File Description SizeFormat 
2024-02-07-dissertation-spinrath-pub.pdfDNB1.91 MBAdobe PDFView/Open


This item is protected by original copyright



This item is licensed under a Creative Commons License Creative Commons